Method of nucleic acid amplification

ABSTRACT

A nucleic acid molecule can be annealed to an appropriate immobilized primer. The primer can then be extended and the molecule and the primer can be separated from one another. The extended primer can then be annealed to another immobilized primer and the other primer can be extended. Both extended primers can then be separated from one another and can be used to provide further extended primers. The process can be repeated to provide amplified, immobilized nucleic acid molecules. These can be used for many different purposes, including sequencing, screening, diagnosis, in situ nucleic acid synthesis, monitoring gene expression, nucleic acid fingerprinting, etc.

CROSS-REFERENCE TO RELATED APPLICATIONS

This application is a continuation of U.S. application Ser. No. 14/465,685, filed Aug. 21, 2014, which is a continuation of U.S. application Ser. No. 13/886,234, filed May 2, 2013, which is a continuation of U.S. application Ser. No. 13/828,047, filed Mar. 14, 2013, which is a continuation of U.S. application Ser. No. 12/148,133, filed Apr. 16, 2008, now U.S. Pat. No. 8,476,044, which is a divisional of U.S. application Ser. No. 10/449,010, filed Jun. 2, 2003, now U.S. Pat. No. 7,985,565, which is a continuation of U.S. application Ser. No. 09/402,277, filed Sep. 30, 1999, now abandoned, which is a national stage under 35 USC 371 of International Application No. PCT/GB98/00961, filed Apr. 1, 1998, which claims the benefit of priority to GB 9706528.8, filed Apr. 1, 1997, GB 9706529.6, filed Apr. 1, 1997, GB 9713236.9, filed Jun. 23, 1997, and GB 9713238.5, filed Jun. 23, 1997. The contents of U.S. application Ser. No. 12/148,133, filed Apr. 16, 2008, Ser. No. 10/449,010, filed Jun. 2, 2003, Ser. No. 09/402,277, filed Sep. 30, 1999, and International Application No. PCT/GB98/00961, filed Apr. 1, 1998, are hereby incorporated by reference in their entirety.

SEQUENCE LISTING

The present application is being filed along with a Sequence Listing in electronic format. The Sequence Listing is provided as a file entitled ILLINC220C5.TXT, created Aug. 20, 2014, which is 3.65 Kb in size. The information in the electronic format of the Sequence Listing is incorporated herein by reference in its entirety.

FIELD

This invention relates, inter alia, to the amplification of nucleic acids.

BACKGROUND

Molecular biology and pharmaceutical drug development now make intensive use of nucleic acid analysis (Friedrich, G. A. Moving beyond the genome projects, Nature Biotechnology 14, 1234 (1996)). The most challenging areas are whole genome sequencing, single nucleotide polymorphism detection, screening and gene expression monitoring. Currently, up to hundreds of thousands of samples are handled in single DNA sequencing projects (Venter, J. C., H. O. Smith, L. Hood, A new strategy for genome sequencing, Nature 381, 364 (1996)). This capacity is limited by the available technology. Projects like the “human genome project” (gene mapping and DNA sequencing) and identifying all polymorphisms in expressed genes involved in common diseases imply the sequencing of millions of DNA samples.

With most of the current DNA sequencing technologies, it is simply not possible to decrease indefinitely the time required to process a single sample. A way of increasing throughput is to perform many processes in parallel. The introduction of robotic sample preparation and delivery, 96 and 384 well plates, high density gridding machines (Maier, E., S. Meier-Ewert, A. R. Ahmadi, J. Curtis, H. Lehrach, Application of robotic technology to automated sequence fingerprint analysis by oligonucleotide hybridization, Journal Of Biotechnology 35, 191 (1994)) and recently the development of high density oligonucleotide arrays (Chee, M., R. Yang, E. Hubbell, A. Berno, X. C. Huang, D. Stern, J. Winkler, D. J. Lockhart, M. S. Morris, and S. P. A. Fodor, Accessing genetic information with high-density DNA arrays, Science 274(5287):610-614, (1996)) are starting to bring answers to the demand for ever higher throughput. Such technologies allow up to 50,000-100,000 samples at a time to be processed within days and even hours (Maier, E., Robotic technology in library screening, Laboratory Robotics and Automation 7, 123 (1995)).

In most known methods for performing nucleic acid analysis, it is necessary to first extract the nucleic acids of interest (e.g., genomic or mitochondrial DNA or messenger RNA (mRNA)) from an organism. Then it is necessary to isolate the nucleic acids of interest from the mixture of all nucleic acids and, usually, to amplify these nucleic acids to obtain quantities suitable for their characterisation and/or detection. Isolating the nucleic acid fragments has been considered necessary even when one is interested in a representative but random set of all of the different nucleic acids, for instance, a representative set of all the mRNAs present in a cell or of all the fragments obtained after genomic DNA has been cut randomly into small pieces.

Several methods can be used to amplify DNA with biological means and are well known by those skilled in the art. Generally, the fragments of DNA are first inserted into vectors with the use of restriction enzymes and DNA ligases. A vector containing a fragment of interest can then be introduced into a biological host and amplified by means of well established protocols. Usually hosts are randomly spread over a growth medium (e.g. agar plates). They can then replicate to provide colonies that originated from individual host cells.

Up to millions of amplifications of cloned DNA fragments can be carried out simultaneously in such hosts. The density of colonies is of the order of 1 colony/mm². In order to obtain DNA from such colonies one option is to transfer the colonies to a membrane, and then to immobilise the DNA from within the biological hosts directly to the membrane (Grunstein, M. and D. S. Hogness, Colony Hybridization: A method for the isolation of cloned DNAs that contain a specific gene, Proceedings of the National Academy of Science, USA, 72:3961 (1975)). With this option, however, the amount of transferred DNA is limited and often insufficient for non-radioactive detection.

Another option is to transfer each colony individually, by sterile technique, into a container (e.g., 96 well plates) where further host cell replication can occur so that more DNA can be obtained from the colonies. Amplified nucleic acids can be recovered from the host cells with an appropriate purification process. However such a procedure is generally time and labour consuming, and difficult to automate.

The revolutionary technique of DNA amplification using the polymerase chain reaction (PCR) was proposed in 1985 by Mullis et al. (Saiki, R., S. Scharf, F. Faloona, K. Mullis, G. Horn, H. Erlich and N. Arnheim, Science 230, 1350-1354 (1985)) and is now well known by those skilled in the art. In this amplification process, a DNA fragment of interest can be amplified using two short (typically about 20 base long) oligonucleotides that flank a region to be amplified, and that are usually referred to as “primers”. Amplification occurs during the PCR cycling, which includes a step during which double stranded DNA molecules are denatured (typically a reaction mix is heated, e.g. to 95° C. in order to separate double stranded DNA molecules into two single stranded fragments), an annealing step (where the reaction mix is brought to e.g., 45° C. in order to allow the primers to anneal to the single stranded templates) and an elongation step (DNA complementary to the single stranded fragment is synthesised via sequential nucleotide incorporation at the ends of the primers with the DNA polymerase enzyme).

The above procedure is usually performed in solution, whereby neither the primers nor a template are linked to any solid matrix.

More recently, however, it has been proposed to use one primer grafted to a surface in conjunction with free primers in solution in order to simultaneously amplify and graft a PCR product onto the surface (Oroskar, A. A., S. E. Rasmussen, H. N. Rasmussen, S. R. Rasmussen, B. M. Sullivan, and A. Johansson, Detection of immobilised amplicons by ELISA-like techniques, Clinical Chemistry 42:1547 (1996)). (The term “graft” is used herein to indicate that a moiety becomes attached to a surface and remains there unless and until it is desired to remove it.) The amplification is generally performed in containers (e.g., in 96 well format plates) in such a way that each container contains the PCR product(s) of one reaction. With such methods, some of the PCR product becomes grafted to a surface of the container having primers therein which has been in contact with the reactant during the PCR cycling. The grafting to the surface simplifies subsequent assays and allows efficient automation.

Arraying of DNA samples is more classically performed on membranes (e.g., nylon or nitro-cellulose membranes). With the use of suitable robotics (e.g., Q-bot™, Genetix ltd, Dorset BH23 3TG UK) it is possible to reach a density of up to 10 samples/mm². Here, the DNA is covalently linked to a membrane by physicochemical means (e.g., UV irradiation). These technologies allow the arraying of large DNA molecules (e.g. molecules over 100 nucleotides long) as well as smaller DNA molecules. Thus both templates and probes can be arrayed.

New approaches based on pre-arrayed glass slides (arrays of reactive areas obtained by ink-jet technology (Blanchard, A. P. and L. Hood, Oligonucleotide array synthesis using ink jets, Microbial and Comparative Genomics, 1:225 (199)) or arrays of reactive polyacrylamide gels (Yershov, G. et al., DNA analysis and diagnostics on oligonucleotide microchips, Proceedings of the National Academy of Science, USA, 93:4913-4918 (1996))) allow the arraying of up to 100 samples/mm². With these technologies, only probe (oligonucleotide) grafting has been reported. Reported numbers of samples/mm² are still fairly low (25 to 64).

Higher sample densities are achievable by the use of DNA chips, which can be arrays of oligonucleotides covalently bound to a surface and can be obtained with the use of micro-lithographic techniques (Fodor, S. P. A. et al., Light directed, spatially addressable parallel chemical synthesis, Science 251:767 (1991)). Currently, chips with 625 probes/mm² are used in applications for molecular biology (Lockhart, D. J. et al., Expression monitoring by hybridisation to high-density oligonucleotide arrays, Nature Biotechnology 14:1675 (1996)). Probe densities of up to 250 000 samples/cm² are claimed to be achievable (Chee, M. et al., Accessing genetic information with high-density DNA arrays, Science 274:610 (1996)). Currently, up to 132000 different oligonucleotides can be arrayed on a single chip of approximately 2.5 cm². Presently, these chips are manufactured by direct solid phase oligonucleotide synthesis with the 3′OH end of the oligo attached to the surface. Thus these chips have been used to provide oligonucleotide probes which cannot act as primers in a DNA polymerase-mediated elongation step.
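The densities quoted above are given in mixed units (per mm² and per cm²). As an illustrative aid only, the following minimal Python sketch converts the two per-cm² figures into per-mm² so that they can be compared with the 625 probes/mm² figure. The function and variable names are hypothetical and are introduced here purely for illustration.

```python
# Illustrative unit conversion for the probe densities quoted in the text.
# All names below are hypothetical helpers, not part of any cited method.

MM2_PER_CM2 = 100  # 1 cm² = 10 mm x 10 mm = 100 mm²


def per_cm2_to_per_mm2(density_per_cm2: float) -> float:
    """Convert a density expressed per cm² into a density per mm²."""
    return density_per_cm2 / MM2_PER_CM2


def chip_density_per_mm2(n_features: int, area_cm2: float) -> float:
    """Density per mm² of n_features spread over a chip of area_cm2."""
    return n_features / (area_cm2 * MM2_PER_CM2)


# 250 000 samples/cm² claimed to be achievable
claimed_per_mm2 = per_cm2_to_per_mm2(250_000)

# 132000 oligonucleotides on a chip of approximately 2.5 cm²
chip_per_mm2 = chip_density_per_mm2(132_000, 2.5)

print(claimed_per_mm2, chip_per_mm2)
```

On these figures the claimed upper limit corresponds to 2500 samples/mm², and the 132000-oligonucleotide chip to roughly 528 oligonucleotides/mm², the same order as the 625 probes/mm² chips in current use.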

When PCR products are linked to the vessel in which PCR amplification takes place, this can be considered as a direct arraying process. The density of the resultant array of PCR products is then limited by the available vessel. Currently available vessels are only in 96 well microtiter plate format. These allow only around 0.02 samples of PCR products/mm² of surface to be obtained.

Using the commercially available Nucleolink™ system obtainable from Nunc A/S (Roskilde, Denmark) it is possible to achieve simultaneous amplification and arraying of samples in containers on the surface of which oligonucleotide primers have been grafted. However, in this case the density of the array of samples is fixed by the size of the vessel. Presently a density of 0.02 samples/mm² is achievable for the 96 well plate format. Increasing this density is difficult. This is apparent since, for instance, the availability of 384 well plates (0.08 samples/mm²) suitable for PCR has been delayed due to technical problems (e.g. heat transfer and capillary effects during filling). It is thus unlikely that orders of magnitude improvements in the density of samples arrayed with this approach can be achieved in the foreseeable future.

SUMMARY

The present invention aims to overcome or at least alleviate some of the disadvantages of prior art methods of nucleic acid amplification.

According to the present invention there is provided a method of nucleic acid amplification, comprising the steps of:

A. providing a plurality of primers that are immobilised but that have one end exposed to allow primer extension;

B. allowing a single stranded target nucleic acid molecule to anneal to one of said plurality of primers over part of the length of said single stranded nucleic acid molecule and then extending that primer using the annealed single stranded nucleic acid molecule as a template, so as to provide an extended immobilised nucleic acid strand;

C. separating the target nucleic acid molecule from the extended immobilised nucleic acid strand;

D. allowing the extended immobilised nucleic acid strand to anneal to one of said plurality of primers referred to in step A) and then extending that primer using the extended immobilised nucleic acid strand as a template, so as to provide another extended immobilised nucleic acid strand; and optionally,

E. separating the annealed extended immobilised nucleic acid strands from one another.

Preferably the method also comprises the step of:

F. using at least one extended immobilised nucleic acid strand to repeat steps D) and E), so as to provide additional extended immobilised nucleic acid strands and, optionally,

G. repeating step F) one or more times.
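The repetition of steps D) and E) can be pictured with a toy model. The following Python sketch is a simplified illustration only, written under assumptions that are not taken from the present disclosure: primers are assumed to sit on a square lattice, each extended strand is assumed to reach only the immediately neighbouring primer sites, every annealing attempt is assumed to succeed, and strand polarity and complementarity are ignored. The names `grow_colony`, `grid_size`, `reach` and `seed` are hypothetical.

```python
# Toy lattice model of repeated steps D) and E): each extended immobilised
# strand anneals to one free primer within reach, which is then extended.
# Assumptions (not from the source): square lattice, 100% efficiency,
# nearest-neighbour reach, no distinction between a strand and its complement.
import random


def grow_colony(grid_size: int, cycles: int, reach: int = 1, seed: int = 0):
    """Return (set of extended sites, strand count after each cycle)."""
    rng = random.Random(seed)
    centre = (grid_size // 2, grid_size // 2)
    extended = {centre}          # result of steps B) and C): one extended strand
    history = [len(extended)]

    for _ in range(cycles):
        newly_extended = set()
        for (x, y) in sorted(extended):
            # Free primer sites that this strand could bend over and anneal to.
            free = [
                (x + dx, y + dy)
                for dx in range(-reach, reach + 1)
                for dy in range(-reach, reach + 1)
                if 0 <= x + dx < grid_size
                and 0 <= y + dy < grid_size
                and (x + dx, y + dy) not in extended
                and (x + dx, y + dy) not in newly_extended
            ]
            if free:
                # Step D): anneal to one free primer and extend it.
                newly_extended.add(rng.choice(free))
        # Step E): strands separate; old and new strands can both act next cycle.
        extended |= newly_extended
        history.append(len(extended))

    return extended, history


colony, history = grow_colony(grid_size=41, cycles=10)
print(history)
```

In this sketch the number of strands can at most double per cycle, and growth slows once strands in the interior of the colony have no free primer left within reach, so that the colony remains a compact, localised area.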

Desirably the single-stranded target nucleic acid sequence is provided by a method in which said single-stranded target nucleic acid is produced by providing a given nucleic acid sequence to be amplified (which sequence may be known or unknown) and adding thereto a first nucleic acid sequence and a second nucleic acid sequence; wherein said first nucleic acid sequence hybridises to one of said plurality of primers and said second nucleic acid sequence is complementary to a sequence which hybridises to one of said plurality of primers.

The second nucleic acid sequence may be a sequence that is the same as the sequence of one of the plurality of primers. Thus the single-stranded target nucleic acid sequence may be provided by a method in which said single-stranded target nucleic acid is produced by providing a given nucleic acid sequence to be amplified (which sequence may be known or unknown) and adding thereto a first nucleic acid sequence and a second nucleic acid sequence; wherein said first nucleic acid sequence hybridises to one of said plurality of primers and said second nucleic acid sequence is the same as the sequence of one of said plurality of primers.

The first and second nucleic acid sequences may be provided at first and second ends of said single-stranded target nucleic acid, although this is not essential.

If desired a tag may be provided to enable amplification products of a given nucleic acid sequence to be identified.

BRIEF DESCRIPTION OF THE DRAWINGS

FIGS. 1A and 1B illustrate a method for the simultaneous amplification and immobilisation of nucleic acid molecules using a single type of primer.

FIG. 2 illustrates how colony growth using a method of the present invention can occur.

FIGS. 3A and 3B illustrate the principle of the method used to produce DNA colonies using the present invention. FIG. 3A illustrates the annealing, elongation and denaturing steps that are used to provide such colonies.

FIG. 3B illustrates a priming step.

FIG. 4 is an example of DNA colonies formed by amplification of a specific template with single primers grafted onto a surface.

FIG. 5 is an example of DNA colonies formed by amplification of a specific template with single primers grafted onto a surface.

FIGS. 6A and 6B illustrate a method for the simultaneous amplification and immobilisation of nucleic acid molecules using two types of primer.

FIG. 7 shows actual DNA colonies produced via the present invention.

FIG. 8 shows actual DNA colonies produced via the present invention.

FIGS. 9A and 9B illustrate a method of the simultaneous amplification and immobilisation of nucleic acid molecules when a target molecule is used as a template having internal sequences that anneal with primers.

FIGS. 10A and 10B illustrate a method to synthesise additional copies of the original nucleic acid strands using nucleic acid strands present in colonies. The newly synthesised strands are shown in solution but can be provided in immobilised form if desired.

FIGS. 11A and 11B show the PCR amplification of DNA from DNA found in the pre-formed DNA colonies.

FIGS. 12A, 12B and 12C illustrate how secondary primers can be generated from DNA colonies.

FIGS. 13A and 13B illustrate how secondary DNA colonies can be generated from secondary primers.

FIGS. 14A and 14B illustrate how primers with different sequences can be generated from a surface functionalised with existing primers.

FIG. 15 depicts methods of preparing DNA fragments suitable for generating DNA colonies.

FIG. 16 illustrates a method for synthesising cRNA using the DNA colony as a substrate for RNA polymerase.

FIG. 17 illustrates a preferable method to determine the DNA sequence of DNA present in individual colonies.

FIG. 18 illustrates a method of determining the sequence of a DNA colony, de novo.

FIG. 19 illustrates the utility of secondary DNA colonies in the assay of mRNA expression levels.

FIG. 20 illustrates the use of the secondary DNA colonies in the isolation and identification of novel and rare expressed genes.

FIG. 21 illustrates the use of the secondary DNA colonies in the isolation and identification of novel and rare expressed genes.

DETAILED DESCRIPTION

Colonies

The method of the present invention allows one or more distinct areas to be provided, each distinct area comprising a plurality of immobilised nucleic acid strands (hereafter called “colonies”). These areas can contain large numbers of amplified nucleic acid molecules. These molecules may be DNA and/or RNA molecules and may be provided in single or double stranded form. Both a given strand and its complementary strand can be provided in amplified form in a single colony.

Colonies of any particular size can be provided.

However, preferred colonies measure from 10 nm to 100 μm across their longest dimension, more preferably from 100 nm to 10 μm across their longest dimension. Desirably a majority of the colonies present on a surface (i.e. at least 50% thereof) have sizes within the ranges given above.

Colonies can be arranged in a predetermined manner or can be randomly arranged. Two or three dimensional colony configurations are possible. The configurations may be regular (e.g. having a polygonal outline or having a generally circular outline) or they may be irregular.

Colonies can be provided at high densities. Densities of over one colony/mm² of surface can be achieved. Indeed densities of over 10², over 10³ or even over 10⁴ colonies/mm² are achievable using the present invention. In preferred embodiments, the present invention provides colony densities of 10⁴ to 10⁵ colonies/mm², more preferably densities of 10⁶ to 10⁷ colonies/mm², thus offering an improvement of 3 to 4 orders of magnitude relative to densities achievable using many of the prior art methods. It is this property of the invention that allows a great advantage over prior art, since the high density of DNA colonies allows a large number of diverse DNA templates (up to 10⁶ to 10⁷ colonies/mm²) to be randomly arrayed and amplified.
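The stated improvement can be checked against the prior art densities quoted in the Background. The short Python sketch below is an illustrative calculation only; the dictionary labels and the function name `orders_of_magnitude` are hypothetical, while the numerical densities are those given in the text (0.02 and 0.08 samples/mm² for 96 and 384 well plates, 10 samples/mm² for membranes, 625 probes/mm² for oligonucleotide chips).

```python
# Illustrative comparison of the more preferred colony densities
# (10^6 to 10^7 colonies/mm²) with prior art densities quoted in the text.
import math


def orders_of_magnitude(new_density: float, reference_density: float) -> float:
    """Base-10 logarithm of the ratio between two densities (same units)."""
    return math.log10(new_density / reference_density)


# Densities in samples (or probes) per mm², as quoted in the Background.
PRIOR_ART_PER_MM2 = {
    "96 well plate": 0.02,
    "384 well plate": 0.08,
    "membrane arraying": 10.0,
    "oligonucleotide chip": 625.0,
}

for label, reference in PRIOR_ART_PER_MM2.items():
    low = orders_of_magnitude(1e6, reference)
    high = orders_of_magnitude(1e7, reference)
    print(f"{label}: {low:.1f} to {high:.1f} orders of magnitude")
```

On these figures, the gain relative to 625 probes/mm² chips is a little over 3 to a little over 4 orders of magnitude, and the gain relative to well plate formats is considerably larger.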

Primers

The immobilised primers for use in the present invention can be provided by any suitable means, as long as a free 3′-OH end is available for primer extension. Where many different nucleic acid molecules are to be amplified, many different primers may be provided. Alternatively “universal” primers may be used, whereby only one or two different types of primer (depending upon the embodiment of the invention) can be used to amplify the different nucleic acid molecules. Universal primers can be used where the molecules to be amplified comprise first and second sequences, as described previously. The provision of universal primers is advantageous over methods such as those disclosed in WO96/04404 (Mosaic Technologies, Inc.) where specific primers must be prepared for each particular sequence to be amplified.

Synthetic oligodeoxynucleotide primers are available commercially from many suppliers (e.g. Microsynth, Switzerland; Eurogentec, Belgium).

Grafting of primers onto silanized glass or quartz and grafting of primers onto silicon wafers or gold surfaces have been described (Maskos, U. and E. M. Southern, Oligonucleotide hybridizations on glass supports: a novel linker for oligonucleotide synthesis and hybridization properties of oligonucleotides synthesised in situ, Nucleic Acids Research 20(7):1679-84, 1992; Lamture, J. B., et al. Direct-detection of nucleic-acid hybridization on the surface of a charge-coupled-device, Nucleic Acids Research 22(11):2121-2125, 1994; Chrisey, L. A., G. U. Lee, and C. E. Oferrall, Covalent attachment of synthetic DNA to self-assembled monolayer films, Nucleic Acids Research 24(15):3031-3039, 1996).

Grafting biotinylated primers to supports covered with streptavidin is another alternative. This grafting method is commonly used for bio-macromolecules in general.

Non-covalent grafting of primers at the interface between an aqueous phase and a hydrophobic phase through a hydrophobic anchor is also possible for the present invention. Such anchoring is commonly used for bio-macromolecules in general (S. Terrettaz et al.: Protein binding to supported lipid membranes, Langmuir 9, 1361 (1993)). Preferred forms of such interfaces would be liposomes, lipidic vesicles, emulsions, patterned bilayers, Langmuir or Langmuir-Blodgett films. The patterns may be obtained by directed patterning on templates, e.g., silicon chips patterned through micro-lithographic methods (Groves, J. T. et al., Micropatterning Fluid Bilayers on Solid Supports, in Science 275, 651 (1997)). The patterns may also be obtained through the self-assembly properties of “colloids”, e.g., emulsions or latex particles (Larsen, A. E. and D. G. Grier, Like charge attractions in metastable colloidal crystallites, Nature 385, 230 (1997)).

In the above methods, one, two or more different primers can be grafted onto a surface. The primers can be grafted homogeneously and simultaneously over the surface.

Using microlithographic methods it is possible to provide immobilised primers in a controlled manner. If direct synthesis of oligonucleotides onto a solid support with a free 3′-OH end is desired, then micro-lithographic methods can be used to simultaneously synthesise many different oligonucleotide primers (Pirrung, M. C. and Bradley, J. C. Comparison of methods for photochemical phosphoramidite-based DNA-synthesis. Journal Of Organic Chemistry 60(20):6270-6276, 1995). These may be provided in distinct areas that may correspond in configuration to colonies to be formed (e.g. they may be several nanometers or micrometers across). Within each area, only a single type of primer oligonucleotide need be provided. Alternatively a mixture comprising a plurality of different primers may be provided. In either case, primers can be homogeneously distributed within each area. They may be provided in the form of a regular array.

Where areas initially comprise only one type of immobilised primer they may be modified, if desired, to carry two or more different types of primer. One way to achieve this is to use molecules as templates for primer extension that have 3′ ends that hybridise with a single type of primer initially present and that have 5′ ends extending beyond the 3′ ends of said primers. By providing a mixture of templates with different sequences from one another, primer extension of one type of primer using the mixture of such templates followed by strand separation will result in different modified primers. (The modified primers are referred to herein as “extended” primers in order to distinguish from the “primary” primers initially present on a surface).

One, two or more different types of extended primer can be provided in this manner at any area where primary primers are initially located. Substantially equal proportions of different templates can be used, if desired, in order to provide substantially equal proportions of different types of immobilised extended primer over a given area. If different proportions of different immobilised extended primers are desired, then this can be achieved by adjusting the proportions of different template molecules initially used accordingly.
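The relationship between template proportions and extended primer proportions can be illustrated numerically. The following Python sketch is a simplified illustration under assumptions not stated in the text: every primary primer is assumed to be extended exactly once, and every template type is assumed to anneal and be copied with equal efficiency, so that the type of extended primer at each site is drawn at random in proportion to the template mixture. The names `extend_primers`, `template_fractions` and the labels "X" and "Y" are hypothetical.

```python
# Toy model: proportions of extended primer types follow the proportions of
# the template mixture (assuming equal annealing and extension efficiency).
import random
from collections import Counter


def extend_primers(n_primers: int, template_fractions: dict, seed: int = 0) -> Counter:
    """Assign each primary primer an extended type drawn from the template mix."""
    rng = random.Random(seed)
    names = list(template_fractions)
    weights = [template_fractions[name] for name in names]
    # One random draw per primary primer, weighted by template proportion.
    return Counter(rng.choices(names, weights=weights, k=n_primers))


# A 3:1 mixture of two hypothetical templates, "X" and "Y".
counts = extend_primers(10_000, {"X": 0.75, "Y": 0.25})
fraction_x = counts["X"] / sum(counts.values())
print(counts, round(fraction_x, 3))
```

With a large number of primers the observed fraction of each extended primer type is close to the fraction of the corresponding template in the mixture, which is why adjusting the template proportions adjusts the extended primer proportions.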

A restriction endonuclease cleavage site may be located within the primer. A primer may also be provided with a restriction endonuclease recognition site which directs DNA cleavage several bases distant (Type II restriction endonucleases). (For the avoidance of doubt, such sites are deemed to be present even if the primer and its complement are required to be present in a double stranded molecule for recognition and/or cleavage to occur.) Alternatively a cleavage site and/or a recognition site may be produced when a primer is extended. In any event, restriction endonucleases can be useful in allowing an immobilised nucleic acid molecule within a colony to be cleaved so as to release at least a part thereof. As an alternative to using restriction endonucleases, ribozymes can be used to release at least parts of nucleic acid molecules from a surface (when such molecules are RNA molecules). Other methods are possible. For example if a covalent bond is used to link a primer to a surface this bond may be broken (e.g. by chemical, physical or enzymatic means).

Primers for use in the present invention are preferably at least five bases long. Normally they will be less than 100 or less than 50 bases long. However this is not essential. Naturally occurring and/or non-naturally occurring bases may be present in the primers.

Target Nucleic Acid Molecules

Turning now to target nucleic acid molecules (also referred to herein as “templates”) for use in the method of the present invention, these can be provided by any appropriate means. A target molecule (when in single-stranded form) comprises a first part having a sequence that can anneal with a first primer and a second part having a sequence complementary to a sequence that can anneal with a second primer. In a preferred embodiment the second part has the same sequence as the second primer.

The second primer may have a sequence that is the same as, or different from, the sequence of the first primer.

The first and second parts of the target nucleic acid molecules are preferably located at the 3′ and at the 5′ ends respectively thereof. However this is not essential. The target molecule will usually also comprise a third part located between the first and second parts. This part of the molecule comprises a particular sequence to be replicated. It can be from any desired source and may have a known or unknown (sometimes referred to as “anonymous”) sequence. It may be derived from random fractionation by mechanical means or by limited restriction enzyme digestion of a nucleic acid sample, for example.
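The three-part layout of a template described above can be made concrete with a small sequence-level sketch. The Python code below is illustrative only and assumes the preferred arrangement (second part at the 5′ end with the same sequence as the second primer, first part at the 3′ end able to anneal with the first primer), with all sequences written 5′ to 3′ and perfect Watson-Crick complementarity taken as the criterion for annealing. The primer and insert sequences and the function names are hypothetical.

```python
# Sketch of a template laid out as: 5'-[second part][third part][first part]-3'
# Assumptions: sequences are DNA written 5'->3'; "can anneal" is modelled as
# exact reverse complementarity. All sequences below are made up.

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(_COMPLEMENT)[::-1]


def build_template(first_primer: str, second_primer: str, insert: str) -> str:
    """Assemble a single-stranded template (5'->3') for the given primers."""
    first_part = reverse_complement(first_primer)  # anneals with the first primer
    second_part = second_primer                    # same sequence as second primer
    return second_part + insert + first_part


def anneals_at_3prime_end(strand: str, primer: str) -> bool:
    """True if the 3' end of the strand is complementary to the primer."""
    return strand.endswith(reverse_complement(primer))


first_primer = "AGGTCCAT"    # hypothetical
second_primer = "CTTGAGCA"   # hypothetical
template = build_template(first_primer, second_primer, insert="GGGAAATTT")

# Extending the first primer along the template gives the template's complement.
extended_strand = reverse_complement(template)

print(anneals_at_3prime_end(template, first_primer))          # template binds first primer
print(anneals_at_3prime_end(extended_strand, second_primer))  # its copy binds second primer
```

The second check shows why the second part is chosen to have the same sequence as the second primer: the strand produced by extending the first primer then carries, at its 3′ end, a sequence that can anneal with the second primer, so that the extended strand can itself serve as a template.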

Further parts of the target molecules may be provided if desired. For example parts designed to act as tags may be provided. A “tag” is defined by its function of enabling a particular nucleic acid molecule (or its complement) to be identified.

Whatever parts are present, target nucleic acid molecules can be provided by techniques known to those skilled in the art of nucleic acid manipulation. For example, two or more parts can be joined together by ligation. If necessary, prior to ligation appropriate modifications can be made to provide molecules in a form ready for ligation. For example if blunt end ligation is desired then a single-strand specific exonuclease such as S1 nuclease could be used to remove single stranded portions of molecules prior to ligation. Linkers and/or adapters may also be used in nucleic acid manipulation. (Techniques useful for nucleic acid manipulation are disclosed in Sambrook et al, Molecular Cloning, 2nd Edition, Cold Spring Harbor Laboratory Press (1989), for example.)

Once a template molecule has been synthesised it can be cloned into a vector and can be amplified in a suitable host before being used in the present invention. It may alternatively be amplified by PCR. As a further alternative, batches of template molecules can be synthesised chemically using automated DNA synthesisers (e.g. from Perkin-Elmer/Applied Biosystems, Foster City, Calif.).

It is however important to note that the present invention allows large numbers of nucleic acid molecules identical in sequence to be provided in a colony arising from a single molecule of template. Furthermore, the template can be re-used to generate further colonies. Thus it is not essential to provide large numbers of template molecules to be used in colony formation.

The template can be of any desired length provided that it can participate in the method of the present invention. Preferably it is at least 10, more preferably at least 20 bases long. More preferably it is at least 100 or at least 1000 bases long. As is the case for primers for use in the present invention, templates may comprise naturally occurring and/or non-naturally occurring bases.

Reaction Conditions

Turning now to reaction conditions suitable for the method of the present invention, it will be appreciated that the present invention uses repeated steps of annealing of primers to templates, primer extension and separation of extended primers from templates. These steps can generally be performed using reagents and conditions known to those skilled in PCR (or reverse transcriptase plus PCR) techniques. PCR techniques are disclosed, for example, in “PCR: Clinical Diagnostics and Research”, published in 1992 by Springer-Verlag.

Thus a nucleic acid polymerase can be used together with a supply of nucleoside triphosphate molecules (or other molecules that function as precursors of nucleotides present in DNA/RNA, such as modified nucleoside triphosphates) to extend primers in the presence of a suitable template.

Excess deoxyribonucleoside triphosphates are desirably provided. Preferred deoxyribonucleoside triphosphates are abbreviated: dTTP (deoxythymidine nucleoside triphosphate), dATP (deoxyadenosine nucleoside triphosphate), dCTP (deoxycytidine nucleoside triphosphate) and dGTP (deoxyguanosine nucleoside triphosphate). Preferred ribonucleoside triphosphates are UTP, ATP, CTP and GTP. However alternatives are possible. These may be naturally or non-naturally occurring. A buffer of the type generally used in PCR reactions may also be provided.

A nucleic acid polymerase used to incorporate nucleotides during primer extension is preferably stable under the pertaining reaction conditions in order that it can be used several times. (This is particularly useful in automated amplification procedures.) Thus, where heating is used to separate a newly synthesised nucleic acid strand from its template, the nucleic acid polymerase is preferably heat stable at the temperature used. Such heat stable polymerases are known to those skilled in the art. They are obtainable from thermophilic micro-organisms. They include the DNA dependent DNA polymerase known as Taq polymerase and also thermostable derivatives thereof. (The nucleic acid polymerase need not however be DNA dependent. It may be RNA dependent. Thus it may be a reverse transcriptase, i.e. an RNA dependent DNA polymerase.)

Typically, annealing of a primer to its template takes place at a temperature of 25 to 90° C. Such a temperature range will normally be maintained during primer extension. Once sufficient time has elapsed to allow annealing and also to allow a desired degree of primer extension to occur, the temperature can be increased, if desired, to allow strand separation. At this stage the temperature will typically be increased to a temperature of 60 to 100° C. [High temperatures can also be used to reduce non-specific priming problems prior to annealing. They can be used to control the timing of colony initiation, e.g. in order to synchronise colony initiation for a number of samples.] Alternatively, the strands may be separated by treatment with a solution of low salt and high pH (>12) or by using a chaotropic salt (e.g. guanidinium hydrochloride) or by an organic solvent (e.g. formamide).
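Where the cycling is automated, the typical temperature windows given above can be encoded as a simple check on a cycle definition. The following Python sketch is a hypothetical illustration only: the class `ThermalCycle`, the function `check_cycle` and the constant names are assumptions introduced here, and the only values taken from the text are the typical ranges of 25 to 90° C. for annealing and extension and 60 to 100° C. for heat-based strand separation.

```python
# Hypothetical helper that checks a cycle definition against the typical
# temperature ranges stated in the text. Names are illustrative assumptions.
from dataclasses import dataclass
from typing import List

ANNEAL_EXTEND_RANGE_C = (25.0, 90.0)   # annealing and primer extension
SEPARATION_RANGE_C = (60.0, 100.0)     # strand separation by heating


@dataclass(frozen=True)
class ThermalCycle:
    anneal_extend_c: float  # temperature held during annealing and extension
    separation_c: float     # temperature used for strand separation


def check_cycle(cycle: ThermalCycle) -> List[str]:
    """Return a list of human-readable problems; empty if within typical ranges."""
    problems = []
    low, high = ANNEAL_EXTEND_RANGE_C
    if not low <= cycle.anneal_extend_c <= high:
        problems.append(
            f"annealing/extension at {cycle.anneal_extend_c} C is outside {low}-{high} C"
        )
    low, high = SEPARATION_RANGE_C
    if not low <= cycle.separation_c <= high:
        problems.append(
            f"strand separation at {cycle.separation_c} C is outside {low}-{high} C"
        )
    return problems


print(check_cycle(ThermalCycle(anneal_extend_c=50.0, separation_c=94.0)))
print(check_cycle(ThermalCycle(anneal_extend_c=95.0, separation_c=50.0)))
```

The check applies only where strands are separated by heating; it would not be relevant where separation is achieved by high pH, a chaotropic salt or an organic solvent.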

Following strand separation (e.g. by heating), preferably a washing step will be performed. The washing step can be omitted between initial rounds of annealing, primer extension and strand separation, if it is desired to maintain the same templates in the vicinity of immobilised primers. This allows templates to be used several times to initiate colony formation. (It is preferable to provide a high concentration of template molecules initially so that many colonies are initiated at one stage.)

The size of colonies can be controlled, e.g. by controlling the number of cycles of annealing, primer extension and strand separation that occur. Other factors which affect the size of colonies can also be controlled. These include the number and arrangement on a surface of immobilised primers, the conformation of a support onto which the primers are immobilised, the length and stiffness of template and/or primer molecules, temperature and the ionic strength and viscosity of a fluid in which the above-mentioned cycles can be performed.

Uses of Colonies

Once colonies have been formed they can be used for any desired purpose.

For example, they may be used in nucleic acid sequencing (whether partial or full), in diagnosis, in screening, as supports for other components and/or for research purposes (preferred uses will be described in greater detail later on). If desired, colonies may be modified to provide different colonies (referred to herein as “secondary colonies” in order to distinguish from the “primary colonies” initially formed).

Surfaces Comprising Immobilised Nucleic Acid Strands

A surface comprising immobilised nucleic acid strands in the form of colonies of single stranded nucleic acid molecules is also within the scope of the present invention.

Normally each immobilised nucleic acid strand within a colony will be located on the surface so that an immobilised and complementary nucleic acid strand thereto is located on the surface within a distance of the length of said immobilised nucleic acid strand (i.e. within the length of one molecule). This allows very high densities of nucleic acid strands and their complements to be provided in immobilised form. Preferably there will be substantially equal proportions of a given nucleic acid strand and its complement within a colony. A nucleic acid strand and its complement will preferably be substantially homogeneously distributed within the colony.

It is also possible to provide a surface comprising single stranded nucleic acid strands in the form of colonies, where in each colony, the sense and anti-sense single strands are provided in a form such that the two strands are no longer at all complementary, or simply partially complementary. Such surfaces are also within the scope of the present invention. Normally, such surfaces are obtained after treating primary colonies, e.g., by partial digestion by restriction enzymes, or by partial digestion after strand separation (e.g., after heating) by an enzyme which digests single stranded DNA, or by chemical or physical means (e.g., by irradiating with light colonies which have been stained by an intercalating dye, e.g., ethidium bromide).

Once single stranded colonies have been provided they can be used to provide double stranded molecules. This can be done, for example, by providing a suitable primer (preferably in solution) that hybridises to the 3′ ends of single stranded immobilised molecules and then extending that primer using a nucleic acid polymerase and a supply of nucleoside triphosphates (or other nucleotide precursors).

Thus surfaces comprising colonies of non-bridged double stranded nucleic acid molecules are also within the scope of the present invention. (The term “non-bridged” is used here to indicate that the molecules are not in the form of the bridge-like structures shown in e.g. FIG. 1 h.)

Using the present invention, small colonies can be provided that contain large numbers of nucleic acid molecules (whether single or double stranded). Many colonies can therefore be located on a surface having a small area. Colony densities that can be obtained may therefore be very high, as discussed supra.

Different colonies will generally be comprised of different amplified nucleic acid strands and amplified complementary strands thereto. Thus the present invention allows many different populations of amplified nucleic acid molecules and their complements to be located on a single surface having a relatively small surface area. The surface will usually be planar, although this is not essential.

Apparatuses

The present invention also provides an apparatus for providing a surface comprising colonies of the immobilised nucleic acid molecules discussed supra.

Such an apparatus can include one or more of the following:

-   a) means for immobilising primers on a surface (although this is not needed if immobilised primers are already provided);
-   b) a supply of a nucleic acid polymerase;
-   c) a supply of precursors of the nucleotides to be incorporated into a nucleic acid (e.g. a supply of nucleoside triphosphates);
-   d) means for separating annealed nucleic acids (e.g. heating means); and
-   e) control means for co-ordinating the different steps required for the method of the present invention.

Other apparatuses are within the scope of the present invention. These allow immobilised nucleic acids produced via the method of the present invention to be analysed. They can include a source of reactants and detecting means for detecting a signal that may be generated once one or more reactants have been applied to the immobilised nucleic acid molecules. They may also be provided with a surface comprising immobilised nucleic acid molecules in the form of colonies, as described supra.

Desirably the means for detecting a signal has sufficient resolution to enable it to distinguish between signals generated from different colonies.

Apparatuses of the present invention (of whatever nature) are preferably provided in automated form so that once they are activated, individual process steps can be repeated automatically.

The present invention will now be described without limitation thereof in sections A to I below with reference to the accompanying drawings.

It should be appreciated that procedures using DNA molecules referred to in these sections are applicable mutatis mutandis to RNA molecules, unless the context indicates otherwise.

It should also be appreciated that where sequences are provided in the following description, these are written from 5′ to 3′ (going from left to right), unless the context indicates otherwise.

The figures provided are summarised below:

FIGS. 1A and 1B illustrate a method for the simultaneous amplification and immobilisation of nucleic acid molecules using a single type of primer.

FIG. 2 illustrates how colony growth using a method of the present invention can occur.

FIG. 3 illustrates the principle of the method used to produce DNA colonies using the present invention. It also illustrates the annealing, elongation and denaturing steps that are used to provide such colonies.

FIG. 4 is an example of DNA colonies formed by amplification of a specific template with single primers grafted onto a surface.

FIG. 5 is an example of DNA colonies formed by amplification of a specific template with single primers grafted onto a surface.

FIGS. 6A and 6B illustrate a method for the simultaneous amplification and immobilisation of nucleic acid molecules using two types of primer.

FIG. 7 shows actual DNA colonies produced via the present invention.

FIG. 8 shows actual DNA colonies produced via the present invention.

FIGS. 9A and 9B illustrate a method of the simultaneous amplification and immobilisation of nucleic acid molecules when a target molecule is used as a template having internal sequences that anneal with primers.

FIGS. 10A and 10B illustrate a method to synthesise additional copies of the original nucleic acid strands using nucleic acid strands present in colonies. The newly synthesised strands are shown in solution but can be provided in immobilised form if desired.

FIGS. 11A and 11B show the PCR amplification of DNA from DNA found in the pre-formed DNA colonies.

FIGS. 12A-C illustrate how secondary primers can be generated from DNA colonies.

FIGS. 13A and 13B illustrate how secondary DNA colonies can be generated from secondary primers.

FIGS. 14A and 14B illustrate how primers with different sequences can be generated from a surface functionalised with existing primers.

FIG. 15 depicts methods of preparing DNA fragments suitable for generating DNA colonies.

FIG. 16 illustrates a method for synthesising cRNA using the DNA colony as a substrate for RNA polymerase.

FIG. 17 illustrates a preferable method to determine the DNA sequence of DNA present in individual colonies.

FIG. 18 illustrates a method of determining the sequence of a DNA colony, de novo.

FIG. 19 illustrates the utility of secondary DNA colonies in the assay of mRNA expression levels.

FIGS. 20 and 21 illustrate the use of the secondary DNA colonies in the isolation and identification of novel and rare expressed genes.

A. Scheme Showing the Simultaneous Amplification and Immobilisation of Nucleic Acid Molecules Using a Single Type of Primer

Referring now to FIG. 1 a), a surface is provided having attached thereto a plurality of primers (only one primer is shown for simplicity). Each primer (1) is attached to the surface by a linkage indicated by a dark block. This may be a covalent or a non-covalent linkage but should be sufficiently strong to keep a primer in place on the surface. The primers are shown having a short nucleotide sequence (5′-ATT). In practice however longer sequences would generally be provided.

FIG. 1 b) shows a target molecule (II) that has annealed to a primer. The target molecule comprises at its 3′ end a sequence (5′-AAT) that is complementary to the primer sequence (5′-ATT). At its 5′ end the target molecule comprises a sequence (5′-ATT) that is the same as the primer sequence (although exact identity is not required).

Between the two ends any sequence to be amplified (or the complement of any sequence to be amplified) can be provided. By way of example, part of the sequence to be amplified has been shown as 5′-CCG.

In FIG. 1 c) primer extension is shown. Here a DNA polymerase is used together with dATP, dTTP, dGTP and dCTP to extend the primer (5′-ATT) from its 3′ end, using the target molecule as a template.

When primer extension is complete, as shown in FIG. 1 d), it can be seen that an extended immobilised strand (III) is provided that is complementary to the target molecule. The target molecule can then be separated from the extended immobilised strand (e.g. by heating, as shown in FIG. 1 e)). This separation step frees the extended, immobilised strand so that it can then be used to initiate a subsequent round of primer extension, as shown in FIGS. 1 f) and 1 g). Here the extended, immobilised strand bends over so that one end of that strand (having the terminal sequence 5′-AAT) anneals with another primer (2, 5′-ATT), as shown in FIG. 1 f). That primer provides a 3′ end from which primer extension can occur, this time using the extended, immobilised strand as a template. Primer extension is shown occurring in FIG. 1 g) and is shown completed in FIG. 1 h).

FIG. 1 i) shows the two extended immobilised strands that were shown in FIG. 1 h) after separation from one another (e.g. by heating). Each of these strands can then themselves be used as templates in further rounds of primer extension initiated from new primers (3 and 4), as shown in FIGS. 1 j) and 1 k). Four single stranded, immobilised strands can be provided after two rounds of amplification followed by a strand separation step (e.g. by heating), as shown in FIG. 1 l). Two of these have sequences corresponding to the sequence of the target molecule originally used as a template. The other two have sequences complementary to the sequence of the target molecule originally used as a template. (In practice a given immobilised strand and its immobilised complement may anneal once.)

It will therefore be appreciated that a given sequence and its complement can be provided in equal numbers in immobilised form and can be substantially homogeneously distributed within a colony.
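The equal numbers of a sequence and its complement can be illustrated with a short counting sketch. It is an idealised model only: it assumes that every immobilised strand templates exactly one new complementary strand per round and that no strand is lost. The function name `strand_counts` is hypothetical and is not part of the described method.

```python
def strand_counts(rounds):
    """Idealised strand counts in one colony.

    Starts from the single extended strand produced by the first primer
    extension (complementary to the target). In each later round every
    immobilised strand is assumed to template one new complementary strand.
    Returns a list of (same_as_target, complementary_to_target) tuples,
    one entry for the starting state and one per round.
    """
    same, comp = 0, 1  # state after the initial primer extension (FIG. 1 d)
    history = [(same, comp)]
    for _ in range(rounds):
        # Each complementary strand yields a target-sense strand and vice versa.
        same, comp = same + comp, comp + same
        history.append((same, comp))
    return history


print(strand_counts(3))
```

Under these assumptions the two populations are equal from the first round onwards and the total doubles each round, which matches the four strands described for FIG. 1 l).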

Further rounds of amplification beyond those shown in FIG. 1 can of course be performed so that colonies comprising large numbers of a given single stranded nucleic acid molecule and a complementary strand thereto can be provided. Only a single template need be used to initiate each colony, although, if desired, a template can be reused to initiate several colonies.

It will be appreciated that the present invention allows very high densities of immobilised extended nucleic acid molecules to be provided. Within a colony each extended immobilised molecule will be located at a surface within one molecule length of another extended immobilised molecule. Thus position 3 shown in FIG. 1 l) is within one molecule length of position 1; position 1 is within one molecule length of position 2; and position 2 is within one molecule length of position 4.

FIG. 2 is provided to illustrate how colony growth can occur (using the method described with reference to FIG. 1 and to FIG. 6 or any other method of the present invention for providing immobilised nucleic acid molecules).

A flat plate is shown schematically in plan view having primers immobilised thereon in a square grid pattern (the primers are indicated by small dots). A regular grid is used solely for simplicity: in many real cases, the positions of the primers might indeed be less ordered or random.

At the position indicated by arrow X a template molecule has annealed to a primer and an initial bout of primer extension has occurred to provide an immobilised, extended nucleic acid strand. Following strand separation, an end of that strand becomes free to anneal to further primers so that additional immobilised, extended nucleic acid strands can be produced. This is shown having occurred sequentially at positions indicated by the letter Y. For simplicity, the primer chosen for annealing is positioned next to the primer carrying the nucleic acid strand: in real cases, the nucleic acid strand could anneal with a primer which is not its next nearest neighbour. However, this primer will obviously be within a distance equal to the length of the nucleic acid strand.

It will be appreciated that annealing at only one (rather than at all) of these positions is required for colony cell growth to occur.

After immobilised, extended, single-stranded nucleic acid molecules have been provided at the positions indicated by letter Y, the resultant molecules can themselves anneal to other primers and the process can be continued to provide a colony comprising a large number of immobilised nucleic acid molecules in a relatively small area.
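The growth pattern described for FIG. 2 can be sketched as a toy simulation. This is an illustration under simplifying assumptions, not a model taken from the description: primers sit on an unbounded square grid, each strand reaches only primers within `reach` grid steps, and each strand converts at most one free primer per cycle, chosen in a fixed scan order. The function name `grow_colony` and its parameters are hypothetical.

```python
from itertools import product


def grow_colony(cycles, reach=1):
    """Toy colony growth on an unbounded square grid of immobilised primers.

    Every strand present at the start of a cycle converts at most one free
    primer within `reach` grid steps into a new extended strand.
    Returns the colony size before the first cycle and after each cycle.
    """
    # Relative positions a strand can bend over to (excluding its own site).
    offsets = [d for d in product(range(-reach, reach + 1), repeat=2) if d != (0, 0)]
    occupied = {(0, 0)}  # position X: the first extended strand
    sizes = [1]
    for _ in range(cycles):
        for x, y in sorted(occupied):  # snapshot of strands at cycle start
            for dx, dy in offsets:
                site = (x + dx, y + dy)
                if site not in occupied:
                    occupied.add(site)  # a free primer is extended here
                    break
        sizes.append(len(occupied))
    return sizes


print(grow_colony(8))
```

In this sketch the colony can at most double per cycle, and growth slows once strands in the interior no longer have a free primer within reach, so the colony stays confined to a small area around position X.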

FIG. 3 shows a simplified version of the annealing, elongation and denaturing cycle. It also depicts the typical observations that can be made, as can be seen in the examples shown in FIGS. 4 and 6. The simultaneous amplification and immobilisation of nucleic acids using solid phase primers has been successfully achieved using the procedure described in Examples 1, 2 and 3 below:

Example 1

Oligonucleotides, phosphorylated at their 5′-termini (Microsynth GmbH, Switzerland), were grafted onto Nucleolink plastic microtitre wells (Nunc, Roskilde, Denmark). The sequence of the oligonucleotide p57 corresponds to the sequence 5′-TTTTTTCACCAACCCAAACCAACCCAAACC (SEQ ID NO:1) and p58 corresponds to the sequence 5′-TTTTTTAGAAGGAGAAGGAAAGGGAAAGGG (SEQ ID NO:2). Microtitre wells with p57 or p58 were prepared as follows. In each Nucleolink well, 30 μl of a 160 nM solution of the oligonucleotide in 10 mM 1-methyl-imidazole (pH 7.0) (Sigma Chemicals, St. Louis, Mo.) was added. To each well, 10 μl of 40 mM 1-ethyl-3-(3-dimethylaminopropyl)-carbodiimide (pH 7.0) (Sigma Chemicals) in 10 mM 1-methyl-imidazole, was added to the solution of oligonucleotides. The wells were then sealed and incubated at 50° C. overnight. After the incubation, wells were rinsed twice with 200 μl of RS (0.4N NaOH, 0.25% Tween 20 (Fluka Chemicals, Switzerland)), incubated 15 minutes with 200 μl RS, washed twice with 200 μl RS and twice with 200 μl TNT (100 mM TrisHCl pH7.5, 150 mM NaCl, 0.1% Tween 20). Tubes were dried at 50° C. and were stored in a sealed plastic bag at 4° C.
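The quantities in the grafting step follow from simple arithmetic, sketched below. The sketch assumes the two solutions mix to the sum of their volumes; the function name `grafting_mix` and the dictionary keys are hypothetical and are used only for illustration.

```python
def grafting_mix(oligo_ul, oligo_nM, edc_ul, edc_mM):
    """Amounts and final concentrations for the grafting reaction in one well.

    Volumes are in microlitres. Multiplying microlitres by nM gives fmol.
    Assumes the volumes are simply additive.
    """
    total_ul = oligo_ul + edc_ul
    oligo_fmol = oligo_ul * oligo_nM  # µl x nM = fmol
    return {
        "total_ul": total_ul,
        "oligo_pmol": oligo_fmol / 1000,
        "oligo_final_nM": oligo_fmol / total_ul,
        "edc_final_mM": edc_ul * edc_mM / total_ul,
    }


# Example 1: 30 µl of 160 nM oligonucleotide plus 10 µl of 40 mM carbodiimide
print(grafting_mix(30, 160, 10, 40))
```

For the volumes given in Example 1 this corresponds to 4.8 pmol of oligonucleotide per well, at 120 nM oligonucleotide and 10 mM carbodiimide in the final 40 μl.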

Colony generation was initiated in each well with 15 μl of priming mix: 1 nanogram template DNA (where the template DNA began with the sequence 5′-AGAAGGAGAAGGAAAGGGAAAGGG (SEQ ID NO: 3) and terminated at the 3′-end with the sequence CCCTTTCCCTTTCCTTCTCCTTCT-3′ (SEQ ID NO:4)), the four dNTPs (0.2 mM), 0.1% BSA (bovine serum albumin, Boehringer-Mannheim, Germany), 0.1% Tween 20, 8% DMSO (dimethylsulfoxide, Fluka Chemicals, Switzerland), 1× Amplitaq PCR buffer and 0.025 units/μl of AmpliTaq DNA polymerase (Perkin Elmer, Foster City, Calif.). The priming reaction was a single round of PCR under the following conditions: 94° C. for 4 minutes, 60° C. for 30 seconds and 72° C. for 45 seconds in a thermocycler (PTC 200, MJ Research, Watertown, Mass.). Then 100 μl TE buffer (10 mM trisHCl, pH 7.5, 1 mM EDTA) was used in three successive one minute long washes at 94° C. The DNA colonies were then formed by adding to each well 20 μl of polymerisation mix, which was identical to the priming mix but lacking the template DNA. The wells were then placed in the PTC 200 thermocycler and colony growing was performed by incubating the sealed wells 4 minutes at 94° C. and cycling through 50 repetitions of the following conditions: 94° C. for 45 seconds, 65° C. for 2 minutes, 72° C. for 45 seconds. After completion of this program, the wells were kept at 8° C. until further use.
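As a rough check on the duration of the colony growing program, the hold times can be summed as in the sketch below. It counts hold times only and ignores the time the thermocycler needs to ramp between temperatures, so the real run is longer. The function name `program_seconds` is hypothetical.

```python
def program_seconds(initial_s, cycle_steps_s, repeats):
    """Total hold time of a thermocycling program, in seconds.

    initial_s     -- initial incubation before cycling
    cycle_steps_s -- hold times of the steps within one cycle
    repeats       -- number of cycles
    Ramp times between temperatures are not included.
    """
    return initial_s + repeats * sum(cycle_steps_s)


# Example 1 growth program: 4 min at 94 C, then 50 x (45 s, 2 min, 45 s)
growth = program_seconds(4 * 60, [45, 2 * 60, 45], 50)
print(growth, "s =", growth / 60, "min")
```

For the program given above this comes to 10740 seconds of hold time, i.e. 179 minutes.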

A 640 base pair fragment corresponding to the central sequence of the template (but not including the 5′-AGAAGGAGAAGGAAAGGGAAAGGG (SEQ ID NO:3) sequence) was amplified by PCR. The isolated fragment was labeled with biotin-N⁴-dCTP (NEN Life Sciences, Boston, Mass.) and a trace of [α-³²P]dCTP (Amersham, UK) using the Prime-it II labeling kit (Stratagene, San Diego, Calif.) to generate a biotinylated probe.

The biotinylated probe was diluted to a concentration of 2.5 nM in EasyHyb (Boehringer-Mannheim, Germany) and 15 μl was hybridized to each sample with the following temperature scheme (PTC 200 thermocycler): 94° C. for 5 minutes, followed by 500 steps of 0.1° C. decrease in temperature every 12 seconds (in other words, the temperature is decreased down to 45° C. in 100 minutes). The samples are then washed as follows: once with 2×SSC/0.1% SDS (2×SSC; 0.3M NaCl/0.03M sodium citrate pH7.0/0.001 mg/ml sodium dodecyl sulfate) at room temperature, once with 2×SSC/0.1% SDS at 37° C. and once with 0.2×SSC/0.1% SDS at 50° C. The wells are then incubated for 30 minutes with 50 μl of red fluorescent, Neutravidin-coated, 40 nm FluoSpheres® (580 nm excitation and 605 nm emission, Molecular Probes Inc., Eugene, Oreg.) in TNT/0.1% BSA. (The solution of microspheres is made from a dilution of 2 μl of the stock solution of microspheres into 1 ml of TNT/0.1% BSA, which is then sonicated for 5 minutes in a 50 W ultra-sound water-bath (Elgasonic, Switzerland), followed by filtration through a 0.22 μm filter (Millex GV4).) The wells are then counted (Cherenkov) on a Microbeta plate scintillation counter (WALLAC, Turku, Finland).

Excess FluoSpheres® are removed by washing for min in TNT/0.1% BSA at room temperature. Images of the stained samples are observed using a 20× objective on an inverted microscope (Axiovert S100TV, Carl Zeiss AG, Oberkochen, Germany) equipped with a Micromax 512×768 CCD camera (Princeton Instruments, Trenton, N.J.) through a XF43 filter set (PB546/FT580/LP590, Omega Optical, Brattleboro, Vt.) with a 5 second exposure.

FIGS. 4 and 5 show the hybridisation results for colony generation on tubes functionalised with either: FIG. 4 oligonucleotide p57 or FIG. 5 oligonucleotide p58. The control reaction (FIG. 4) shows very few fluorescent spots, since the sequences of the flanking regions on the template do not correspond to the primer sequences grafted onto the well. In contrast, FIG. 5 shows the number of fluorescent spots detected when the primers grafted to the wells match the flanking sequences on the initiating DNA template. Calculating the number of fluorescent spots detected and taking into consideration the magnification, we can estimate that there are between 3 and 5×10⁷ colonies/cm². The photos are generated by the program Winview 1.6.2 (Princeton Instruments, Trenton, N.J.) with backgrounds and intensities normalised to the same values.
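Why p58 supports colony formation on this template while p57 does not can be checked directly from the sequences given in Example 1. The sketch below is illustrative only: it treats the six leading T residues of each grafted oligonucleotide as a spacer that does not take part in annealing (an assumption, since the description does not state their role), and it requires an exact match. The names `reverse_complement` and `can_prime` are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


TEMPLATE_5P = "AGAAGGAGAAGGAAAGGGAAAGGG"  # SEQ ID NO:3, 5' end of the template
TEMPLATE_3P = "CCCTTTCCCTTTCCTTCTCCTTCT"  # SEQ ID NO:4, 3' end of the template
P57 = "TTTTTTCACCAACCCAAACCAACCCAAACC"    # SEQ ID NO:1
P58 = "TTTTTTAGAAGGAGAAGGAAAGGGAAAGGG"    # SEQ ID NO:2


def can_prime(primer, template_3p_end, spacer=6):
    """True if the primer (minus an assumed 5' spacer) is exactly
    complementary to the 3' end of the template."""
    core = primer[spacer:]
    return reverse_complement(core) == template_3p_end[-len(core):]


print("p57 primes template:", can_prime(P57, TEMPLATE_3P))
print("p58 primes template:", can_prime(P58, TEMPLATE_3P))
```

The 3′ end of the template is the reverse complement of its 5′ end, and the 5′ end is identical to p58 after the six T residues, so both the template and every strand copied from it can anneal to grafted p58. Neither end matches p57.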

B. Scheme Showing the Simultaneous Amplification and Immobilisation of Nucleic Acid Molecules Using Two Different Types of Primer

Referring now to FIGS. 6A and 6B, another embodiment of the present invention is illustrated. Here two different immobilised primers are used to provide primer extension.

In this embodiment the target molecule shown is provided with a nucleotide sequence at its 3′ end (AAT-3′) that is complementary to the sequence of a first primer, (5′-ATT, I), which is grafted on the surface, so that annealing with that primer can occur. The sequence (5′-GGT) at the 5′ end of the target molecule, III, corresponds to the sequence (5′-GGT) of a second primer, II, which is also grafted to the surface, so that the sequence which is complementary to the sequence at the 5′ end can anneal with that said second primer. Generally said complementary sequence (5′-ACC) is chosen so that it will not anneal with the first primer (5′-ATT). Unlike the situation described in section A, once the 3′ end of a newly synthesised strand anneals to a primer on the surface, it will have to find a primer whose sequence is different from the sequence it carries at its 5′ end (see the difference between FIGS. 1A, (f) and 6A (f)).

The embodiment shown in FIGS. 6A and 6B has an advantage over the embodiment illustrated in FIG. 1 since the possibility of one end of a single stranded target molecule annealing with another end of the same molecule in solution can be avoided and therefore amplification can proceed further. The possibility of annealing occurring between both ends of an immobilised complement to a target molecule can also be avoided.

Example 2

A mix of two oligonucleotides which are phosphorylated at the 5′-end (Microsynth GmbH, Balgach, Switzerland) has been grafted on 96 well Nucleolink plates (Nunc, Denmark) as recommended by the manufacturer. The resulting plates have been stored dry at 4° C. The sequence of the primer, P1, was 5′-GCGCGTAATACGACTCACTA (SEQ ID NO:5), the sequence of the other primer, P2, was 5′-CGCAATTAACCCTCACTAAA (SEQ ID NO:6). These plates are specially formulated by Nunc, allowing the covalent grafting of 5′ phosphorylated DNA fragments through a standard procedure.

A template has been cloned in a vector (pBlueScript Skminus, Stratagene Inc, San Diego, Calif.) with the appropriate DNA sequence at the cloning site (i.e., corresponding to P1 and P2 at position 621 and 794 respectively), and a 174 bp long linear double stranded DNA template has been obtained by PCR amplification, using P1 and P2. The template PCR product has been purified on Qiagen Qia-quick columns (Qiagen GmbH, Hilden, Germany) in order to remove the nucleotides and the primers used during the PCR amplification.

The purified template (in 50 μl solution containing 1×PCR buffer (Perkin Elmer, Foster City, Calif.) with the four deoxyribonucleoside triphosphates (dNTPs) at 0.2 mM (Pharmacia, Uppsala, Sweden) and 2.5 units of AmpliTaq Gold DNA polymerase (Perkin Elmer, Foster City, Calif.)) has been spread on the support, i.e. on the Nucleolink plates grafted with P1 and P2 (the plates have been rinsed with a solution containing 100 mM TRIS-HCl (pH 7.5), 150 mM NaCl and 0.1% Tween 20 (Fluka, Switzerland) at room temperature for 15 min). This solution has been incubated at 93° C. for 9 minutes to activate the DNA polymerase and then 60 cycles (94° C./30 sec., 48° C./30 sec., 72° C./30 sec.) have been performed on a PTC 200 thermocycler. Several different concentrations of PCR template have been tested (approximately 1, 0.5, 0.25, 0.125, 0.0625 ng/μl) and for each sample a control reaction carried out without Taq polymerase has been performed (same conditions as above but without DNA polymerase).

Each sample has been stained with YO-PRO (Molecular Probes, Portland Oreg.), a highly sensitive stain for double stranded DNA. The resulting products have been observed on a confocal microscope using a 40× objective (LSM 410, Carl Zeiss AG, Oberkochen, Germany) with appropriate excitation (a 488 nm argon laser) and detection filters (510 low pass filter) (note: the bottom of each well is flat and allows observation with an inverted fluorescence microscope).

In FIG. 7, the control well (without added DNA template, panel a) shows only rare objects which can be observed on a blank surface (these objects were useful at this stage for reporting that the focus was correct). These objects have an irregular shape, are 20 to 100 micro-meters in size and have a thickness much larger than the field depth of the observation. In a well where DNA polymerase was present (FIG. 7, panel ii), in addition to the objects of irregular shape observed in the control well, a great number of fluorescent spots can be observed. They present a circular shape, they are 1 to 5 micro meters in size and do not span the field of view. The number of spots depends on the concentration of the template used for initiating colony formation. From the observed size of the colonies, one can estimate that more than 10,000 distinct colonies can be arrayed within 1 mm² of support.
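One way to arrive at an estimate of this order is sketched below. It is an assumption-laden illustration rather than the calculation actually used: colonies are placed on a square lattice, and the centre-to-centre pitch (colony diameter plus gap) is a free parameter. The function name `colonies_per_mm2` is hypothetical.

```python
def colonies_per_mm2(pitch_um):
    """Number of colonies that fit in 1 mm^2 on a square lattice.

    pitch_um -- centre-to-centre spacing in micrometres (an assumed value:
                colony diameter plus the gap left between colonies).
    """
    per_side = 1000 // pitch_um  # 1 mm = 1000 micrometres
    return per_side * per_side


# Largest observed colonies (5 um) with a 5 um gap between neighbours
print(colonies_per_mm2(10))
# Largest observed colonies (5 um) packed edge to edge
print(colonies_per_mm2(5))
```

With 5 micrometre colonies separated by gaps of their own width the lattice holds 10,000 colonies per mm², and edge-to-edge packing gives 40,000 per mm², consistent with the figure of more than 10,000 stated above.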

Example 3

Oligonucleotides (Microsynth GmbH, Switzerland) were grafted onto Nucleolink wells (Nunc, Denmark). Oligonucleotide P1 corresponds to the sequence 5′-TTTTTTCTCACTATAGGGCGAATTGG (SEQ ID NO:7) and oligonucleotide P2 corresponds to 5′-TTTTTTCTCACTAAAGGGAACAAAAGCTGG (SEQ ID NO:8). In each Nucleolink well, 45 μl of a 10 mM 1-methyl-imidazole (pH 7.0) (Sigma Chemicals, St. Louis, Mo.) solution containing 360 fmol of P1 and 360 fmol of P2 was added. To each well, 15 μl of 40 mM 1-ethyl-3-(3-dimethylaminopropyl)-carbodiimide (pH 7.0) (Sigma Chemicals) in 10 mM 1-methyl-imidazole, was added to the solution of oligonucleotides. The wells were then sealed and incubated at 50° C. for 16 hours. After the incubation, wells have been rinsed twice with 200 μl of RS (0.4N NaOH, 0.25% Tween 20), incubated 15 minutes with 200 μl RS, washed twice with 200 μl RS, and twice with 200 μl TNT (100 mM Tris/HCl pH7.5, 150 mM NaCl, 0.1% Tween 20), before they are put to dry at 50° C. in an oven. The dried tubes were stored in a sealed plastic bag at 4° C.

Colony growing was initiated in each well with 15 μl of initiation mix (1×PCR buffer, 0.2 mM dNTPs and 0.75 units of AmpliTaq Gold DNA polymerase, 20 nanograms of template DNA), where the template DNA was either S1 DNA or S2 DNA or a mixture of different ratios of S1 DNA and S2 DNA, as indicated in discussion to FIG. 6B. S1 and S2 are 704 base pair and 658 bp fragments, respectively, which have been cloned into pBlueScript Skminus plasmids and subsequently amplified through a PCR using P1 and P2 as primers. The fragments were purified on Qiagen Qia-quick columns (QIAGEN GmbH, Germany) in order to remove the nucleotides and the primers.

Each well was sealed with Cycleseal™ (Robbins Scientific Corp., Sunnyvale, Calif.), and incubated at 93° C. for 9 minutes, 65° C. for 5 minutes and 72° C. for 2 minutes and back to 93° C. Then 200 μl TNT solution was used in three successive one minute long washes at 93° C. The initiation mix was then replaced by 15 μl growing mix (same as initiation mix, but without template DNA), and growing was performed by incubating the sealed wells 9 minutes at 93° C. and repeating 40 times the following conditions: 93° C. for 45 seconds, 65° C. for 3 minutes, 72° C. for 2 minutes. After completion of this program, the wells were kept at 6° C. until further use. The temperature control was performed in a PTC 200 thermo-cycler, using the silicon pad provided in the Nucleolink kit and the heated (104° C.) lid of the PTC 200.

A 640 base pair fragment corresponding to the central sequence of the S1 fragment, but not including the P1 or P2 sequence, was amplified by PCR as previously described. The probe was labelled with biotin-16-dUTP (Boehringer-Mannheim, Germany) using the Prime-it II random primer labelling kit (Stratagene, San Diego, Calif.) according to the manufacturer's instructions.

The biotinylated probes were hybridized to the samples in EasyHyb buffer (Boehringer-Mannheim, Germany), using the following temperature scheme (in the PTC 200 thermocycler): 94° C. for 5 minutes, followed by 68 steps of 0.5° C. decrease in temperature every 30 seconds (in other words, the temperature is decreased down to 60° C. in 34 minutes), using sealed wells. The samples are then washed 3 times with 200 μl of TNT at room temperature. The wells are then incubated for 30 minutes with 50 μl TNT containing 0.1 mg/ml BSA. Then the wells are incubated 5 minutes with 15 μl of solution of red fluorescent, Neutravidin-coated, 40 nm FluoSpheres® (580 nm excitation and 605 nm emission, Molecular Probes, Portland, Oreg.). The solution of microspheres is made of 2 μl of the stock solution of microspheres, which have been sonicated for 5 minutes in a 50 W ultra-sound water-bath (Elgasonic, Bienne, Switzerland), diluted in 1 ml of TNT solution containing 0.1 mg/ml BSA and filtered with Millex GV4 0.22 μm pore size filter (Millipore, Bedford, Mass.).
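The end temperature and duration of the stepwise cooling used for hybridisation can be reproduced with the short sketch below. It assumes each step lowers the temperature by a fixed amount and is then held for the stated interval. The function name `ramp` is hypothetical.

```python
def ramp(start_c, steps, step_c, step_s):
    """Final temperature (deg C) and total duration (minutes) of a
    stepwise cooling ramp with `steps` decrements of `step_c` degrees,
    one every `step_s` seconds."""
    final_c = start_c - steps * step_c
    minutes = steps * step_s / 60
    return final_c, minutes


# Example 3: from 94 C, 68 steps of 0.5 C every 30 seconds
print(ramp(94, 68, 0.5, 30))
```

For the Example 3 scheme this gives a final temperature of 60° C. after 34 minutes, in agreement with the values stated above.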

The stained samples are observed using an inverted Axiovert 10 microscope using a 20× objective (Carl Zeiss AG, Oberkochen, Germany) equipped with a Micromax 512×768 CCD camera (Princeton Instruments, Trenton, N.J.), using a XF43 filter set (PB546/FT580/LP590, Omega Optical, Brattleboro, Vt.), and 10 seconds of light collection. The files are converted to TIFF format and processed in the suitable software (PhotoPaint, Corel Corp., Ottawa, Canada). The processing consisted of inversion and linear contrast enhancement, in order to provide a picture suitable for black and white print-out on a laser printer.

FIG. 8 shows the results for 3 different ratios of the S1/S2 templates used in the initiating reaction: i) the S1/S2 is 1/0, many spots can be observed, ii) the S1/S2 is 1/10, and the number of spots is approximately 1/10 of the number of spots which can be observed in the i) image, as expected, and iii) the S1/S2 is 0/1, and only a few rare spots can be seen.
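The expected trend in FIG. 8 can be expressed as a simple proportion. The sketch below assumes that S1 and S2 initiate colonies with equal efficiency, that the total number of colonies is the same for each mixture, and that the probe detects only S1 colonies; none of these assumptions is stated explicitly in the description. The function name `expected_s1_fraction` is hypothetical.

```python
from fractions import Fraction


def expected_s1_fraction(s1, s2):
    """Expected fraction of colonies detected by an S1-specific probe
    for a template mixture with S1:S2 parts, assuming equal initiation
    efficiency for both templates."""
    return Fraction(s1, s1 + s2)


for s1, s2 in [(1, 0), (1, 10), (0, 1)]:
    print(f"S1/S2 = {s1}/{s2}: expected fraction {expected_s1_fraction(s1, s2)}")
```

Under these assumptions the 1/10 mixture is expected to show 1/11 of the spots seen with S1 alone, which is close to the approximately 1/10 reported, and the S2-only sample is expected to show none.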

C. Scheme Showing Simultaneous Amplification and Immobilisation of Nucleic Acid Molecules when the Target Molecule Contains Internal Sequences Complementary to the Immobilised Primers

FIGS. 9A and 9B are provided to show that the sequences shown at the 5′ and 3′ ends of the target molecule illustrated in FIGS. 1A, 1B, 6A, and 6B need not be located at the ends of a target molecule.

A target nucleic acid molecule (II) may have a sequence at each (or either) end that is neither involved in annealing with a primer nor in acting as a template to provide a complementary sequence that anneals with a primer (sequence 5′-AAA and sequence 5′-CCC). One of the internal sequences (5′-AAT) is used as a template to synthesise a complementary sequence, III, thereto (5′-TTT), as is clear from FIG. 9A, (a) to (e).

The sequence 5′-TTT is not however itself used to provide a sequence complementary thereto, as is clear from FIGS. 9A (f) to (h) and FIG. 9B (i) to (k). It can be seen from FIG. 9B, (l) that only one of the four immobilised strands shown after two rounds of primer extension and a strand separation step comprises the additional sequence 5′-TTT and that no strand comprising a complementary sequence (5′-AAA) to this sequence is present (i.e. only one strand significantly larger than the others is present). After several rounds of amplification the strand comprising the sequence 5′-TTT will represent an insignificant proportion of the total number of extended, immobilised nucleic acid molecules present.
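How quickly the single longer strand becomes negligible can be illustrated with the sketch below. It assumes ideal doubling of the number of immobilised strands in each round, counted as in FIG. 9B (four strands after two rounds), and that the longer strand is never copied in full. The function name `long_strand_fraction` is hypothetical.

```python
from fractions import Fraction


def long_strand_fraction(rounds):
    """Fraction of immobilised strands that carry the extra 5'-TTT sequence
    after `rounds` rounds of ideal doubling (one such strand in total)."""
    return Fraction(1, 2 ** rounds)


for n in (2, 10, 20):
    print(n, long_strand_fraction(n), float(long_strand_fraction(n)))
```

After two rounds the longer strand is one in four, as shown in FIG. 9B (l); after twenty rounds of ideal doubling it would be about one in a million.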

D. Using Nucleic Acid Strands Present in Colonies to Synthesise Additional Copies of Nucleic Acid Strands

Amplified, single stranded nucleic acid molecules present in colonies provided by the present invention can themselves be used as templates to synthesise additional nucleic acid strands.

FIGS. 10A and 10B illustrate one method of synthesising additional nucleic acids using immobilised nucleic acids as a starting point.

Colonies will usually comprise both a given nucleic acid strand and its complement in immobilised form (FIG. 10A, (a)). Thus they can be used to provide additional copies not only of a given nucleic acid strand but also of its complement.

One way of doing this is to provide one or more primers (primers TTA and TGG) in solution that anneal to amplified, immobilised nucleic acid strands present in colonies (FIG. 10A, (c)) provided by the present invention. (These primers may be the same as primers initially used to provide the immobilised colonies, apart from being provided in free rather than immobilised form.) The original DNA colony is denatured by heat to its single-stranded form (FIG. 10A, (b)), allowing primers TTA and TGG to anneal to the available 3′ end of each DNA strand. Primer extension, using AmpliTaq DNA polymerase and the four deoxyribonucleoside triphosphates (labeled or unlabeled), can then be used to synthesise complementary strands to immobilised nucleic acid strands or at least to parts thereof (step (iii)).

Once newly formed strands (FIG. 10B, (d)) have been synthesised by the process described above, they can be separated from the immobilised strands to which they are hybridised (e.g. by heating). The process can then be repeated if desired using the PCR reaction, to provide a large number of such strands in solution (FIG. 10B, (e)).

Strands synthesised in this manner, after separation from the immobilised strands, can, if desired, be annealed to one another (i.e. a given strand and its complement can anneal) to provide double-stranded nucleic acid molecules in solution. Alternatively they can be separated from one another to provide homogenous populations of single-stranded nucleic acid molecules in solution.

It should also be noted that once single-stranded molecules are provided in solution they can be used as templates for PCR (or reverse PCR). Therefore it is not essential to continue to use the immobilised nucleic acid strands to obtain further amplification of given strands or complementary strands thereto.

It should be noted that where a plurality of colonies are provided and nucleic acid strands in different colonies have different sequences, it is possible to select only certain colonies for use as templates in the synthesis of additional nucleic acid molecules. This can be done by using primers for primer extension that are specific for molecules present in selected colonies.

Alternatively primers can be provided to allow several or all of the colonies to be used as templates. Such primers may be a mixture of many different primers (e.g. a mixture of all of the primers originally used to provide all of the colonies, but with the primers being provided in solution rather than in immobilised form).

Example 4

Oligonucleotides (Microsynth GmbH, Balgach, Switzerland) were grafted onto Nucleolink wells (Nunc, Denmark). Oligonucleotide P1 corresponds to the sequence 5′-TTTTTTTTTTCACCAACCCAAACCAACCCAAACC (SEQ ID NO:9) and oligonucleotide P2 corresponds to 5′-TTTTTTTTTTAGAAGGAGAAGGAAAGGGAAAGGG (SEQ ID NO:10). To each Nucleolink well, 45 μl of a 10 mM 1-methyl-imidazole (pH 7.0) (Sigma Chemicals) solution containing 360 fmol of P1 and 360 fmol of P2 was added. To each well, 15 μl of 40 mM 1-ethyl-3-(3-dimethylaminopropyl)-carbodiimide (pH 7.0) (Sigma Chemicals) in 10 mM 1-methyl-imidazole was added to the solution of oligonucleotides. The wells were then sealed and incubated at 50° C. for 16 hours. After the incubation, the wells were rinsed twice with 200 μl of RS (0.4N NaOH, 0.25% Tween 20), incubated for 15 minutes with 200 μl RS, washed twice with 200 μl RS and twice with 200 μl TNT (100 mM Tris/HCl pH 7.5, 150 mM NaCl, 0.1% Tween 20), and then dried at 50° C. in an oven. The dried tubes were stored in a sealed plastic bag at 4° C.
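The concentrations implied by the grafting volumes can be checked with simple arithmetic. The following sketch is illustrative only; the helper names are assumptions, and it relies solely on the amounts and volumes stated above (fmol per μl is numerically equal to nM).

```python
def molar_nM(amount_fmol: float, volume_ul: float) -> float:
    """Concentration in nM; 1 fmol/µl equals 1 nmol/l."""
    return amount_fmol / volume_ul


def after_dilution(stock_conc: float, stock_volume_ul: float, final_volume_ul: float) -> float:
    """Concentration after diluting a stock into a larger final volume (same units as stock)."""
    return stock_conc * stock_volume_ul / final_volume_ul


OLIGO_FMOL = 360.0        # amount of P1 (and of P2) per well
OLIGO_VOLUME_UL = 45.0    # volume of the oligonucleotide solution
EDC_STOCK_MM = 40.0       # carbodiimide stock concentration
EDC_VOLUME_UL = 15.0      # volume of carbodiimide solution added
FINAL_VOLUME_UL = OLIGO_VOLUME_UL + EDC_VOLUME_UL

if __name__ == "__main__":
    print("each oligo before carbodiimide (nM):", molar_nM(OLIGO_FMOL, OLIGO_VOLUME_UL))
    print("each oligo during grafting (nM):", molar_nM(OLIGO_FMOL, FINAL_VOLUME_UL))
    print("carbodiimide during grafting (mM):",
          after_dilution(EDC_STOCK_MM, EDC_VOLUME_UL, FINAL_VOLUME_UL))
```

On these figures each oligonucleotide is at 8 nM before and 6 nM after the carbodiimide is added, and the carbodiimide is at 10 mM in the final 60 μl; the 1-methyl-imidazole stays at 10 mM because both solutions contain it at that concentration.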

Colony growing was initiated in each well with 15 μl of initiation mix (1×PCR buffer, 0.2 mM dNTPs, 0.75 units of AmpliTaq DNA polymerase and 20 nanograms of template DNA), where the template DNA was either S1 DNA, S2 DNA or a 1/1 mixture of S1 DNA and S2 DNA, as indicated in the discussion of Example 3. S1 and S2 are 658 base pair and 704 base pair fragments, respectively, which were prepared as described in Example 3.

Each well was sealed with Cycleseal™ (Robbins Scientific Corp., Sunnyvale, Calif.) and incubated at 93° C. for 9 minutes, 65° C. for 5 minutes and 72° C. for 2 minutes and back to 93° C. Then 200 μl TNT solution was used in three successive one minute long washes at 93° C. The initiation mix was then replaced by 15 μl growing mix (same as initiation mix, but without template DNA) and growing was performed by incubating the sealed wells for 9 minutes at 93° C. and repeating 40 times the following conditions: 93° C. for 45 seconds, 65° C. for 3 minutes, 72° C. for 2 minutes. After completion of this program, the wells were kept at 6° C. until further use. The temperature control was performed in a PTC 200 thermo-cycler.
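As a planning aid, the total programmed hold time of the growing step can be computed from the conditions given above. This is a sketch only: it ignores the ramp times between temperatures, which depend on the instrument and are not stated here, and the names used are illustrative.

```python
# Growing program as described above: one initial hold, then 40 cycles of three steps.
INITIAL_HOLD_S = 9 * 60            # 9 minutes at 93 °C
CYCLES = 40
CYCLE_STEPS_S = (45, 3 * 60, 2 * 60)  # 93 °C, 65 °C, 72 °C


def hold_time_s(initial_hold_s: int, cycles: int, steps_s: tuple) -> int:
    """Sum of programmed hold times in seconds (temperature ramps are not included)."""
    return initial_hold_s + cycles * sum(steps_s)


if __name__ == "__main__":
    total = hold_time_s(INITIAL_HOLD_S, CYCLES, CYCLE_STEPS_S)
    print(f"{total} s = {total // 60} min")
```

With the stated values the programmed holds add up to 14,340 seconds, i.e. 239 minutes, so the growing step occupies the thermo-cycler for roughly four hours before ramping is taken into account.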

Different treatments were applied to 6 sets (A, B, C, D, E and F) of 3 wells (1, 2, 3), one prepared with template S1, one with template S1 and template S2 and one prepared with template S2 alone (yielding A1, A2, A3, . . . , F1, F2, F3). Set A was left untreated; set B was incubated for 10 minutes with BAL-31 exonuclease (New England Biolabs, Beverly, Mass.) at 37° C. in BAL-31 buffer (BAL-31 essentially digests double stranded DNA which has both ends free); set C was incubated for 10 minutes with S1 nuclease (Pharmacia, Uppsala, Sweden) at 37° C. in S1-buffer (S1 nuclease essentially digests single stranded DNA); and sets D, E and F were incubated with both BAL-31 and S1 nucleases. Reactions were stopped by rinsing the wells with TNT buffer.

PCR (25 cycles, 30 sec. at 94° C., 45 sec. at 60° C., 45 sec. at 72° C.) was performed in the Nucleolink wells with 0.25 μM primers P70 (5′-CACCAACCCAAACCAACCCAAACCACGACTCACTATAGGGCGAA (SEQ ID NO:11)) and P71 (5′-AGAAGGAGAAGGAAAGGGAAAGGGTAAAGGGAACAAAAGCTGGA (SEQ ID NO:12)) in solution in sets A, B, C and D. P70 and P71 are suited for the amplification of both S1 and S2, since primer P70 contains the sequence of primer P1 and P71 contains P2. In the set E wells, PCR was performed with a set of forward (P150, 5′-GGTGCTGGTCCTCAGTCTGT (SEQ ID NO:13)) and reverse (P151, 5′-CCCGCTTACCAGTTTCCATT (SEQ ID NO:14)) primers which are within S1 and not within S2 so as to produce a 321 bp PCR product, and in the set F wells, PCR was performed with a set of forward (P152, 5′-CTGGCCTTATCCCTAACAGC (SEQ ID NO:15)) and reverse (P153, 5′-CGATCTTGGCTCATCACAAT (SEQ ID NO:16)) primers which are within S2 and not within S1 so as to produce a 390 bp PCR product. For each of the 18 PCR reactions, 3 μl of solution was used for gel electrophoresis on 1% agarose in the presence of 0.1 μg/ml ethidium bromide. The pictures of the gels are presented in FIGS. 11A and 11B. These pictures show that DNA in the colonies is protected from exonuclease digestion (sets B, C and D as compared to set A), and that both S1 and S2 can be recovered either simultaneously using P1 and P2 (sets A, B, C and D) or specifically (sets E and F). In sets E and F, where the shorter PCR products are more efficiently amplified than the longer PCR products in sets A, B, C and D, a cross-contamination between the S1 and S2 templates is detectable (see lanes E2 and F1).
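The relationship between the grafted primers and the solution primers can be verified directly from the sequences listed in this Example. The sketch below is illustrative; it treats the run of T residues at the 5′ end of P1 and P2 as a spacer that is not part of the priming sequence, which is an assumption made for the comparison, and the helper name `priming_core` is hypothetical.

```python
# Sequences as listed in Example 4 (5' to 3').
P1 = "TTTTTTTTTTCACCAACCCAAACCAACCCAAACC"
P2 = "TTTTTTTTTTAGAAGGAGAAGGAAAGGGAAAGGG"
P70 = "CACCAACCCAAACCAACCCAAACCACGACTCACTATAGGGCGAA"
P71 = "AGAAGGAGAAGGAAAGGGAAAGGGTAAAGGGAACAAAAGCTGGA"


def priming_core(grafted_primer: str) -> str:
    """Grafted primer without its leading T residues (assumed to act as a spacer)."""
    return grafted_primer.lstrip("T")


if __name__ == "__main__":
    for grafted, free, name in ((P1, P70, "P70/P1"), (P2, P71, "P71/P2")):
        core = priming_core(grafted)
        print(name, "shared 5' portion:", core, "->", free.startswith(core))
```

Both comparisons succeed: the 5′ portion of P70 is identical to P1 without its T residues, and the 5′ portion of P71 is identical to P2 without its T residues.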

E. Provision of Secondary Colonies

It is also possible to modify initially formed colonies to provide different colonies (i.e. to provide colonies comprising immobilised nucleic acid molecules with different sequences from those molecules present in the initially formed colonies). Here, the initially formed colonies are referred to as “primary colonies” and the later formed colonies as “secondary colonies”. A preliminary procedure is necessary to turn the primary colonies into “secondary primers” which will be suitable for secondary colony generation.

FIGS. 12A, 12B, and 12C show how ‘secondary primers’ are generated using existing primary colonies. As a starting point, the primary colony (FIG. 12A, (a)) is left in the fully hybridised, double-stranded form. A single-strand specific DNA exonuclease might be used to remove all primers which have not been elongated. One could also choose to cap all free 3′-OH ends of primers with dideoxyribonucleotide triphosphates using a DNA terminal transferase (step (i), FIG. 12A, (b)).

Secondly and independently, the DNA molecules forming the colonies can be cleaved by using endonucleases. For example, a restriction enzyme can be used that recognises a specific site within the colony (depicted by the ‘RE’ arrow in FIG. 12B, (c)) and cleaves the DNA colony (step (ii), FIG. 12B). If desired, the enzymatically cleaved colony (FIG. 12B, (d)) can then be partially digested with a 3′ to 5′ double-strand specific exonuclease (e.g. E. coli exonuclease III, depicted by ‘N’, step (iii), FIG. 12B, (d)). In any case, the secondary primers are available after denaturation (e.g., by heat) and washing (FIG. 12B, (e)).

Alternatively, the double stranded DNA forming the colonies (FIG. 12C, (f)) can be digested with the double-strand specific 3′-5′ exonuclease, which digests only one strand of double stranded DNA. An important case is when the exonuclease digests only a few bases of the DNA molecule before being released in solution, and when digestion can proceed when another enzyme binds to the DNA molecule (FIG. 12C, (g)). In this case the exonuclease digestion will proceed until there remain only single stranded molecules which, on average, are half the length of the starting material, and are without any complementary parts (which could form partial duplexes) remaining in the single stranded molecules in a colony (FIG. 12C, (h)).
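The statement that the remaining single strands are, on average, half the length of the starting material can be illustrated with a small simulation. This is a simplified model and not a description of the enzyme kinetics: it assumes that each binding event removes between one and `burst` bases from the 3′ end of one strand chosen at random, and that digestion stops as soon as no base-paired region is left. All names and parameter values are assumptions made for the example.

```python
import random


def digest_duplex(length: int, burst: int = 3, rng: random.Random = None):
    """Simulate 3'->5' double-strand-specific digestion of one duplex.

    The top strand keeps bases [0, top_end); the bottom strand keeps bases
    [bottom_start, length). Digestion continues only while the two retained
    regions still overlap, i.e. while a duplex region exists.
    Returns (top_strand_length, bottom_strand_length).
    """
    rng = rng or random.Random()
    top_end, bottom_start = length, 0
    while top_end > bottom_start:
        paired = top_end - bottom_start
        step = min(rng.randint(1, burst), paired)  # never digest past the duplex
        if rng.random() < 0.5:
            top_end -= step        # enzyme bound the 3' end of the top strand
        else:
            bottom_start += step   # enzyme bound the 3' end of the bottom strand
    return top_end, length - bottom_start


def mean_retained_lengths(length: int, runs: int = 500, seed: int = 1):
    """Average retained length of each strand over many simulated duplexes."""
    rng = random.Random(seed)
    results = [digest_duplex(length, rng=rng) for _ in range(runs)]
    mean_top = sum(t for t, _ in results) / runs
    mean_bottom = sum(b for _, b in results) / runs
    return mean_top, mean_bottom


if __name__ == "__main__":
    print(mean_retained_lengths(600))
```

In this model the two retained lengths of any one duplex always add up to the original length and share no complementary region, so the average for each strand is half the starting length, in line with the description above.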

In all cases, these treatments result in single-stranded fragments grafted onto a support which correspond to the sequence of the original template and that can be used for new DNA colony growing if an appropriate new template is provided for colony initiation (FIG. 12B, (e) and FIG. 12C, (h)).

The result of such a treatment, i.e. a support holding secondary primers, will be referred to as a “support for secondary colony growing”. Templates useful for secondary colony growing may include molecules having known sequences (or complements of such sequences). Alternatively templates may be derived from unsequenced molecules (e.g. random fragments). In either event the templates should be provided with one or more regions for annealing with nucleic acid strands present in the primary colonies.

FIGS. 13A and 13B show how a secondary colony can be generated when an appropriate template (TP, FIG. 13A, (a)) is provided for a second round of DNA colony generation on a support for secondary colony growing, holding secondary primers. In this example, treatment of the primary colony as described above has generated the secondary primers, SP1 and SP2 (FIG. 13A, (a)). The template TP will hybridise to its complementary secondary primer, SP1, and following an extension reaction using a DNA polymerase as described, will be extended as depicted (FIG. 13A, (b)). Following a denaturing (step ii), reannealing (step iii) and DNA polymerase (step iv) cycle, a replica of the original primary colony will be formed (FIG. 13B, (e)).

The maximum size of a secondary colony provided by this embodiment of the present invention is restricted by the size of the primary colony onto which it grows. Several secondary growing processes can be used sequentially to create colonies for specific applications (i.e. a first colony can be replaced with a second colony, the second colony can be replaced with a third colony, etc.).

F. Provision of Extended Primers

FIGS. 14A and 14B show how extended primers can be generated on an array of oligonucleotides. The same procedure could be applied to a support covered with colonies or secondary primers as described in section E.

In FIG. 14A, (a) a support is provided having a plurality of immobilised primers shown thereon. Different immobilised primers are shown present in different regions of the support (represented by squares). Primers having the sequence 5′-AAA are present in one square and primers having the sequence 5′-GGG are present in another square.

FIG. 14A, (b) and FIG. 14B, (c) and (d) show how the initial primers present (initial primers) are modified to give different primers (extended primers). In this example, those initial primers having the sequence 5′-AAA are modified to produce two different types of extended primers, having the sequences 5′-AAAGCC and 5′-AAATAC respectively. This is achieved by the hybridisation of oligonucleotide templates, 5′-GTATTT and 5′-GGCTTT, to the primary primers immobilised on the surface (FIG. 14A, (b)), followed by DNA polymerase reaction. Those initial primers having the sequence 5′-GGG are modified to produce two different types of extended primers, having the sequences 5′-GGGTAT and 5′-GGGTAA (FIG. 14B, (d)) in a similar manner.
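The extended primer sequences given above follow from base pairing between the initial primer and the oligonucleotide template. A minimal sketch of this relationship is given below; it models a fully extended primer as the reverse complement of the template, and the function names are illustrative. The templates shown for the 5′-GGG primers are not stated in the text and are inferred here from the extended sequences.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(_COMPLEMENT)[::-1]


def extend_primer(primer: str, template: str) -> str:
    """Return the extended primer after full polymerase extension on the template.

    Both sequences are written 5' to 3'. The 3' end of the template must be
    complementary to the primer, otherwise no extension is modelled.
    """
    product = reverse_complement(template)
    if not product.startswith(primer):
        raise ValueError("template 3' end is not complementary to the primer")
    return product


if __name__ == "__main__":
    print(extend_primer("AAA", "GTATTT"))   # AAATAC
    print(extend_primer("AAA", "GGCTTT"))   # AAAGCC
    # Templates inferred (not given in the text) for the 5'-GGG primers:
    print(reverse_complement("GGGTAT"), reverse_complement("GGGTAA"))
```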

The technique of producing extended primers is useful for transforming immobilised oligonucleotides provided on a DNA chip or other surface into immobilised primers useful in amplifying a particular target nucleic acid sequence and/or in amplifying a complementary strand thereto.

G. Preparation of Nucleic Acid Fragments

Apparatuses of the present invention can be used for various procedures, some of which will be described later on. Nucleic acid fragments for use in colony generation may be prepared differently for the different procedures (referred to herein as “prepared nucleic acids”). Various preparation procedures are described below:

(i) Preparation of Random DNA Fragments

Here is described a method to prepare DNA originating from one biological sample (or from a plurality of samples) for amplification in the case where it is not necessary to keep track of the origin of the DNA when it is incorporated within a colony.

The DNA of interest is first extracted from the biological sample and cut randomly into “small” pieces (e.g., 50 to 10,000 bases long, but preferably 500 to 1000 base pairs in length, represented by bar ‘I’, FIG. 15, (a)). (This can be done e.g., by a phenol-chloroform extraction followed by ultrasound treatment, mechanical shearing, partial digestion with frequent cutter restriction endonucleases or other methods known by those skilled in the art.) In order to standardise experimental conditions, the extracted and cut DNA fragments can be size-fractionated, e.g., by agarose gel electrophoresis, sucrose gradient centrifugation or gel chromatography. Fragments obtained within a single fraction can be used in providing templates in order to reduce the variability in size of the templates.

Secondly, the extracted, cut and (optionally) sorted template DNA fragments can be ligated with oligonucleotide linkers (IIa and IIb, FIG. 15, (a)) containing the sequence of the primer(s) which have previously been grafted onto a support. This can be achieved, for instance, using “blunt-end” ligation. Alternatively, the template DNA fragments can be inserted into a biological vector at a site that is flanked by the sequence of the primers that are grafted on the support. This cloned DNA can be amplified within a biological host and extracted. Obviously, if one is working with a single primer grafted to the solid support for DNA colony formation, purifying fragments containing both P1 and P2 primers does not pose a problem.

Hereafter, the DNA fragments obtained after such a suitable process are designated by the expression: “prepared genomic DNA” (III, FIG. 15, (a)).

(ii) Preparation of Random DNA Fragments Originating from a Plurality of Samples

Here is described how to prepare DNA originating from a plurality of biological samples in the case where it is necessary to keep track of the origin of the DNA when it is incorporated within a colony.

The procedure is the same as that described in the previous section except that in this case, the oligonucleotide linkers used to tail the randomly cut genomic DNA fragments are now made of two parts: the sequence of the primers grafted onto the surface (P1 and P2, FIG. 15, (b)) and a “tag” sequence which is different for each sample and which will be used for identifying the origin of the DNA colony. Note that for each sample the tag need not be unique; a plurality of tags could be used. Hereafter, we will designate the DNA fragments obtained after such a suitable process by the expression “tagged genomic DNA” (III, FIG. 15, (b)).
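The use of a two-part linker to trace a colony back to its sample can be sketched as follows. The example is a simplification under stated assumptions: only the linker at one end of the fragment is modelled, the tag sequences, their fixed length and the sample names are hypothetical, and the primer sequence is that of P1 from Example 4 without its T residues. Each sample is given more than one tag to reflect the possibility of using a plurality of tags.

```python
PRIMER = "CACCAACCCAAACCAACCCAAACC"   # P1 of Example 4 without its leading T residues
TAG_LENGTH = 6                         # hypothetical fixed tag length

# Hypothetical tags; each sample may carry more than one tag.
SAMPLE_TAGS = {
    "sample_1": ("ACGTAC", "ACGTCA"),
    "sample_2": ("TGCATG", "TGCAGT"),
}
TAG_TO_SAMPLE = {tag: sample for sample, tags in SAMPLE_TAGS.items() for tag in tags}


def tag_fragment(fragment: str, sample: str, tag_index: int = 0) -> str:
    """Attach a two-part linker (grafted-primer sequence followed by a sample tag)."""
    return PRIMER + SAMPLE_TAGS[sample][tag_index] + fragment


def identify_sample(template: str):
    """Read the tag that follows the primer sequence; return None if it is not recognised."""
    if not template.startswith(PRIMER):
        return None
    tag = template[len(PRIMER):len(PRIMER) + TAG_LENGTH]
    return TAG_TO_SAMPLE.get(tag)


if __name__ == "__main__":
    tagged = tag_fragment("GGATCCATTGCA", "sample_2", tag_index=1)
    print(identify_sample(tagged))
```

Because the tag is read independently of the fragment itself, the same lookup works whatever sequence the fragment carries.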

This tagging procedure can be used for providing colonies carrying a means of identification which is independent from the sequence carried by the template itself. This could also be useful when some colonies are to be recovered specifically (using the procedure given in section D). This could also be useful where the recovered colonies are further processed, e.g., by creating new primary colonies, and a cross reference between the original colonies and the new colonies is desired.

(iii) Preparation of DNA Fragments Corresponding to a Plurality of DNA Sequences Originating from One Sample

The DNA of interest can first be extracted from a biological sample by any means known by those skilled in the art (as mentioned supra). Then the specific sequences of interest can be amplified with PCR (step (i), FIG. 15, (c)) using PCR primers (IIa and IIb) made of two parts: 1) at the 5′-end, the sequences corresponding to the sequences of primer oligonucleotide(s) that have been grafted onto a surface (P1 and P2) and 2) at the 3′-end, primer sequences specific to the sequence of interest (S1 and S2). Hereafter, we will designate the DNA fragments obtained after such a suitable process by the expression: “prepared DNA” (III, FIG. 15, (c)).

(iv) Preparation of a Plurality of DNA Fragments Originating from a Plurality of Samples

The procedure is the same as in the previous section except that in this case the DNA primers (IIa and IIb) used to perform the PCR amplification (step (i), FIG. 13, (d)) are now made of three parts: 1) the sequence of the primers grafted onto the surface (P1 and P2), 2) a “tag” sequence which is different for each sample and which will be used for identifying the origin of the DNA colony and 3) primer sequences surrounding the specific sequence of interest (S1 and S2). Note that for each sample, a plurality of tags might be used, as in (ii) supra.

Hereafter, we will designate the DNA fragments obtained after such a suitable process by the expression: “tagged DNA” (III, FIG. 13, (d)). Potential uses of tags are the same as in (ii), supra.

(v) Preparation of mRNA

The procedure is similar to the procedures described for preparing DNA fragments in the previous sections except that the starting point is to extract mRNA by any means known to those skilled in the art (e.g., by use of commercially available mRNA preparation kits). The mRNA can be copied into double-stranded cDNA by any means known to those skilled in the art (e.g. by using a reverse transcriptase and a DNA polymerase). The tags and primers described supra can be used in conjunction with the process of double-stranded cDNA synthesis to allow their incorporation into the templates. Hereafter, we will designate the mRNA fragments obtained after such suitable processes by the expressions: “prepared total mRNA” (cf. “prepared genomic DNA”, as described in section (i) supra), “tagged total mRNA” (cf. “tagged genomic DNA”, as described in section (ii) supra), “prepared mRNA” (cf. “prepared DNA”, as described in section (iii) supra) and “tagged mRNA” (cf. “tagged DNA”, as described in section (iv) supra).

H. Preferred Detection Assays

In assay procedures of the present invention labels may be used to provide detectable signals. Examples include:

-   a) a fluorescent group or an energy-transfer based fluorescence system.
-   b) a biotin based system. In this case colonies can be incubated with streptavidin labelled with a fluorescent group or an enzyme (e.g. fluorescent latex beads coated with streptavidin; streptavidin labelled with fluorescent groups; enzymes for use with the corresponding fluorescence assay).
-   c) a system based on detecting an antigen or a fragment thereof, e.g. a hapten (including biotin and fluorescent groups). In this case colonies can be incubated with antibodies (e.g. specific for a hapten). The antibodies can be labelled with a fluorescent group or with an enzyme (e.g. fluorescent latex beads coated with the antibody; antibodies labelled with fluorescent groups; antibodies linked to an enzyme for use with a corresponding fluorescence or luminescence assay, etc.).
-   d) a radio-label (e.g. incorporated by using a 5′ polynucleotide kinase and [γ-³²P]adenosine triphosphate or a DNA polymerase and [α-³²P] or [α-³³P] deoxyribonucleoside triphosphates to add a radioactive phosphate group(s) to a nucleic acid). Here colonies can be incubated with a scintillation liquid.
-   e) a dye or other staining agent.

Labels for use in the present invention are preferably attached

-   a) to nucleic acids
-   b) to proteins which bind specifically to double stranded DNA (e.g., histones, repressors, enhancers) and/or
-   c) to proteins which bind specifically to single stranded DNA (e.g. single-stranded nucleic acid binding protein).

Labelled colonies are preferably detected by:

-   a) measuring fluorescence.
-   b) measuring luminescence.
-   c) measuring radioactivity.
-   d) measuring flow or electric field induced fluorescence anisotropy. and/or
-   e) measuring the polymer layer thickness.

Staining agents can be used in the present invention. Thus DNA colonies can be incubated with a suitable DNA-specific staining agent, such as the intercalating dyes ethidium bromide, YO-YO and YO-PRO (Molecular Probes, Eugene, Oreg.). With certain staining agents the result can be observed with a suitable fluorescence imaging apparatus.

Examples of particular assays/procedures will now be described in greater detail:

I. Preferred Embodiments of Assays of the Present Invention

(i) Nucleic Acid Probe Hybridisation Assay

DNA colonies are first prepared for hybridisation. Then they are hybridised with a probe (labelled or unlabelled). If required, the hybridised probe is assayed, and the result is observed. This can be done with an apparatus of the present invention (e.g. as described supra).

Preparation for Hybridisation

In a preferred embodiment of the present invention colonies are treated with a DNA restriction endonuclease which is specific either for a sequence provided by a double stranded form of one of the primers originally grafted onto the surface where colonies are formed or for another sequence present in a template DNA molecule (see e.g. FIG. 12B, (c)).

After restriction enzyme digestion, the colonies can be heated to a temperature high enough for double stranded DNA molecules to be separated. After this thermal denaturing step, the colonies can be washed to remove the non-hybridised, detached single-stranded DNA strands, leaving the attached single-stranded DNA.

In another embodiment the colonies can be partially digested with a double-strand-specific 3′ to 5′ DNA exonuclease (see section E, FIG. 12C, (f)) which removes one strand of DNA duplexes starting from the 3′ end, thus leaving a part of a DNA molecule in single stranded form.

Alternatively, DNA in colonies can first be heat denatured and then partially digested with a single-strand specific 3′ to 5′ DNA exonuclease which digests single stranded DNA starting from the 3′ end.

A further alternative is simply to heat denature DNA in the colonies.

Hybridisation of the Probe

Single-stranded nucleic acid probes (labelled or unlabelled) can be hybridised to single-stranded DNA in colonies at the appropriate temperature and buffer conditions (which depend on the sequence of each probe, and can be determined using protocols known to those skilled in the art).

Assaying of Unlabelled Hybridised Probes

A hybridised probe provided initially in unlabelled form can be used as a primer for the incorporation of the different (or a subset of the different) labelled (or a mix of labelled and unlabelled) deoxyribonucleoside triphosphates with a DNA polymerase. The incorporated labelled nucleotides can then be detected as described supra.

Cyclic Assaying of Labelled or Unlabelled Probes

Firstly, the DNA colonies can be prepared for hybridisation by the methods described supra. Then they can be hybridised with a probe (labelled or initially unlabelled). If required, hybridised labelled probes are assayed and the result is observed with an apparatus as described previously. The probe may then be removed by heat denaturing and a probe specific for a second DNA sequence may be hybridised and detected. These steps may be repeated with new probes as many times as desired.

Secondly, the probes can be assayed as described supra for unlabelled probes, except that only a subset (preferably 1 only) of the different (labelled or unlabelled) nucleotides is used at each cycle. The colonies can then be assayed for monitoring the incorporation of the nucleotides. This second process can be repeated until a sequence of a desired length has been determined.

(ii) In Situ RNA Synthesis Assay

In this embodiment, DNA colonies can be used as templates for in situ RNA synthesis as depicted in FIG. 16, (a). DNA colonies can be generated from templates and primers, such that an RNA polymerase promoter sequence is positioned at one end of the double-stranded DNA in the colony. DNA colonies can then be incubated with RNA polymerase and the newly synthesised RNA (cRNA) can be assayed as desired. The detection can be done non-specifically (e.g., staining) or in a sequence dependent way (e.g., hybridisation).

The DNA template (I, FIG. 16, (a)) to be amplified into a colony is generated by PCR reaction using primers (IIa and IIb) which have the following four parts: 1) a sequence identical to the sequences of the primers grafted onto the surface (‘P1’ and ‘P2’), 2) a “tag” sequence which is different for each sample, 3) a sequence corresponding to an RNA polymerase promoter, i.e. the T3, T7 and SP6 RNA promoters (‘RPP’, FIG. 16, (a)) and 4) primer sequences surrounding the specific sequence of interest (‘S1’ and ‘S2’). Hereafter, we will designate the DNA fragments obtained after such a suitable process by the expression: “tagged RNA synthesis DNA” (III, FIG. 16, (b)).
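The four-part structure of such a primer can be represented as follows. This sketch is purely illustrative: the order of the parts follows the numbering above, the tag is hypothetical, the promoter shown is a commonly quoted T7 promoter sequence used here as an assumption, and the grafted-primer and target-specific parts are borrowed from P1 (without its T residues) and P150 of Example 4 for concreteness.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RnaSynthesisPrimer:
    """Four-part PCR primer, parts listed 5' to 3'."""
    graft: str      # 1) sequence of the primer grafted onto the surface
    tag: str        # 2) sample-specific tag
    promoter: str   # 3) RNA polymerase promoter ('RPP')
    specific: str   # 4) sequence flanking the sequence of interest

    _ORDER = ("graft", "tag", "promoter", "specific")

    @property
    def sequence(self) -> str:
        return "".join(getattr(self, name) for name in self._ORDER)

    def segments(self) -> dict:
        """Half-open (start, end) coordinates of each part within the full primer."""
        coords, pos = {}, 0
        for name in self._ORDER:
            part = getattr(self, name)
            coords[name] = (pos, pos + len(part))
            pos += len(part)
        return coords


EXAMPLE_PRIMER = RnaSynthesisPrimer(
    graft="CACCAACCCAAACCAACCCAAACC",    # P1 of Example 4 without its T residues
    tag="ACGTAC",                        # hypothetical tag
    promoter="TAATACGACTCACTATAGGG",     # assumed T7 promoter sequence
    specific="GGTGCTGGTCCTCAGTCTGT",     # P150 of Example 4, for illustration
)

if __name__ == "__main__":
    print(EXAMPLE_PRIMER.sequence)
    print(EXAMPLE_PRIMER.segments())
```

Placing the promoter between the tag and the target-specific part means that, in this arrangement, transcription started at the promoter runs into the sequence of interest.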

After amplification of the DNA template from the original DNA sample, these templates are used to generate DNA colonies. The DNA colonies (IV, FIG. 16, (c)) are then incubated with the RNA polymerase specific for the RNA polymerase promoter (‘RPP’, FIG. 16, (c)). This will generate a copy of RNA specific for the DNA colony template (Template-cRNA, V, FIG. 16, (d)).

cRNA thus synthesised can be isolated and used as hybridisation probes, as messenger RNA (mRNA) templates for in vitro protein synthesis or as templates for in situ RNA sequence analysis.

(iii) Methods for Sequencing

In another embodiment of the present invention, colonies can be analysed in order to determine sequences of nucleic acid molecules which form the colonies. Since very large numbers of the same nucleic acid molecules can be provided within each colony the reliability of the sequencing data obtained is likely to be very high.

The sequences determined may be full or partial. Sequences can be determined for nucleic acids present in one or more colonies. A plurality of sequences may be determined at the same time.

In some embodiments the sequence of a complementary strand to a nucleic acid strand to be sequenced (or of a part thereof) may be obtained initially. However this sequence can be converted using base-pairing rules to provide the desired sequence (or a part thereof). This conversion can be done by a computer or by a person. It can be done after each step of primer extension or can be done at a later stage.
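The conversion by base-pairing rules is straightforward to carry out by computer. The sketch below shows the two timings mentioned above, namely conversion after each step of primer extension and conversion of the whole read at a later stage; the function names are illustrative, and both sequences are written 5′ to 3′.

```python
PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}


def convert_at_end(complement_read: str) -> str:
    """Convert a completed complementary-strand read into the desired strand."""
    return "".join(PAIR[base] for base in reversed(complement_read))


def convert_stepwise(complement_read: str) -> str:
    """Convert base by base, as each nucleotide is detected during primer extension.

    Each newly detected base pairs with a template base lying 5' of the bases
    already converted, so its partner is added at the front of the result.
    """
    desired = ""
    for base in complement_read:
        desired = PAIR[base] + desired
    return desired


if __name__ == "__main__":
    read = "AAATAC"
    print(convert_at_end(read), convert_stepwise(read))
```

Both routes give the same result, so the choice between them is a matter of convenience.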

Sequencing can be done by various methods. For example methods relying on sequential restriction endonuclease digestion and linker ligation can be used. One such method is disclosed in WO95/27080 for example. This method comprises the steps of: ligating a probe to an end of a polynucleotide, the probe having a nuclease recognition site; identifying one or more nucleotides at the end of the polynucleotide; and cleaving the polynucleotide with a nuclease recognising the nuclease recognition site of the probe such that the polynucleotide is shortened by one or more nucleotides.

However in a preferred method of the present invention, amplified nucleic acid molecules (preferably in the form of colonies, as disclosed herein) are sequenced by allowing primers to hybridise with the nucleic acid molecules, extending the primers and detecting the nucleotides used in primer extension. Preferably, after extending a primer by a single nucleotide, the nucleotide is detected before a further nucleotide is used in primer extension (step-by-step sequencing).

One or more of the nucleotides used in primer extension may be labelled. The use of labelled nucleotides during primer extension facilitates detection. (The term “label” is used in its broad sense to indicate any moiety that can be identified using an appropriate detection system. Preferably the label is not present in naturally-occurring nucleotides.) Ideally, labels are non-radioactive, such as fluorophores. However radioactive labels can be used.

Where nucleotides are provided in labelled form the labels may be the same for different nucleotides. If the same label is used each nucleotide incorporation can be used to provide a cumulative increase of the same signal (e.g. of a signal detected at a particular wavelength). Alternatively different labels may be used for each type of nucleotide (which may be detected at different wavelengths).

Thus four different labels may be provided for dATP, dTTP, dCTP and dGTP, or the same label may be provided for them all. Similarly, four different labels may be provided for ATP, UTP, CTP and GTP, or the same label may be provided for them all. In some embodiments of the present invention a mixture of labelled and unlabelled nucleotides may be provided, as will be described in greater detail later on.

In a preferred embodiment of the present invention the sequencing of nucleic acid molecules present in at least 2 different colonies is performed simultaneously. More preferably, sequencing of nucleic acid molecules present in over 10, over 100, over 1000 or even over 1,000,000 different colonies is performed simultaneously. Thus if colonies having different nucleic acid molecules are provided, many different sequences (full or partial) can be determined simultaneously, i.e. over 10, over 100, over 1000 or even over 1,000,000 different sequences may be determined simultaneously.

If desired, controls may be provided, whereby a plurality of colonies comprising the same nucleic acid molecules are provided. By determining whether or not the same sequences are obtained for nucleic acid molecules in these colonies it can be ascertained whether or not the sequencing procedure is reliable.

One sequencing method of the present invention is illustrated in FIG. 17, which is entitled “in situ sequencing”. On prepared DNA colonies hybridised with an appropriate sequencing primer, cyclic addition of the individual deoxyribonucleoside triphosphates and DNA polymerase will allow the determination of the DNA sequence immediately 3′ to the sequencing primer. In the example outlined in FIG. 17, the addition of dGTP allows the determination of colony 1 to contain a ‘G’. In the second cycle addition of dATP is detected in both colonies, determining that both colonies have an ‘A’ in the next position. After several repetitions of the addition of single deoxyribonucleoside triphosphates, it will be possible to determine any sequence. For example sequences of at least 10, at least 20, at least 50 or at least 100 bases may be determined.
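The cyclic addition scheme can be sketched in software as follows. This is a simplified model, not a description of FIG. 17 itself: the colony sequences are hypothetical (chosen so that the first two cycles reproduce the dGTP and dATP observations described above), the order of nucleotide addition is an assumption, and it is assumed that a run of identical bases is incorporated within a single cycle and gives a proportionally larger signal.

```python
def cyclic_sequencing(products: dict, order: str = "GATC", max_cycles: int = None):
    """Simulate step-by-step sequencing by cyclic addition of single nucleotides.

    `products` maps a colony name to the sequence synthesised 3' of the
    sequencing primer. Returns (reads, log), where `log` lists, for each cycle,
    the nucleotide added and the number of bases incorporated per colony.
    """
    if max_cycles is None:
        # Every full round of additions extends each unfinished colony by at least one base.
        max_cycles = len(order) * max(len(p) for p in products.values())
    reads = {name: "" for name in products}
    log = []
    for cycle in range(max_cycles):
        nucleotide = order[cycle % len(order)]
        signal = {}
        for name, product in products.items():
            start = pos = len(reads[name])
            while pos < len(product) and product[pos] == nucleotide:
                pos += 1                      # incorporate while the template allows it
            reads[name] = product[:pos]
            signal[name] = pos - start        # detected signal for this colony and cycle
        log.append((nucleotide, signal))
        if reads == products:
            break
    return reads, log


if __name__ == "__main__":
    colonies = {"colony 1": "GATTACA", "colony 2": "ATCGGA"}  # hypothetical sequences
    reads, log = cyclic_sequencing(colonies)
    for nucleotide, signal in log[:2]:
        print(nucleotide, signal)
    print(reads)
```

In this example the first cycle (dGTP) gives a signal only in colony 1 and the second cycle (dATP) gives a signal in both colonies, and continued cycling recovers the full sequence of each colony in parallel.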

If colonies are provided initially in a form comprising double-stranded molecules the colonies can be processed to provide single-stranded molecules for use in sequencing as described above. (It should however be noted that double stranded molecules can be used for sequencing without such processing. For example a double stranded DNA molecule can be provided with a promoter sequence and step-by-step sequencing can then be performed using an RNA polymerase and labeled ribonucleotides (cf. FIG. 16, (d)). Another alternative is for a nick to be introduced in a double stranded DNA molecule so that nick translation can be performed using labeled deoxyribonucleotides and a DNA polymerase with 5′ to 3′ exonuclease activity.)

One way of processing double-stranded molecules present in colonies to provide single-stranded colonies is described later with reference to FIG. 18. Here double-stranded immobilised molecules present in a colony (which may be in the form of bridge-like structures) are cleaved and this is followed by a denaturing step. (Alternatively a denaturing step could be used initially and could be followed by a cleavage step). Preferably cleavage is carried out enzymatically. However other means of cleavage are possible, such as chemical cleavage. (An appropriate cleavage site can be provided in said molecule.) Denaturing can be performed by any suitable means. For example it may be performed by heating and/or by changing the ionic strength of a medium in the vicinity of the nucleic acid molecules.

Once single-stranded molecules to be sequenced are provided, suitable primers for primer extension can be hybridised thereto. Oligonucleotides are preferred as primers. These are nucleic acid molecules that are typically 6 to 60, e.g. 15 to 25 nucleotides long. They may comprise naturally and/or non-naturally occurring nucleotides. (However other molecules, e.g. longer nucleic acid strands may alternatively be used as primers, if desired.) The primers for use in sequencing preferably hybridise to the same sequences present in amplified nucleic acid molecules as do primers that were used to provide said amplified nucleic acids. (Primers having the same/similar sequences can be used for both amplification and sequencing purposes).

When primers are provided in solution and are annealed (hybridised) to nucleic acid molecules present in colonies to be sequenced, those primers which remain in solution or which do not anneal specifically can be removed after annealing. Preferred annealing conditions (temperature and buffer composition) prevent non-specific hybridisation. These may be stringent conditions. Such conditions would typically be annealing temperatures close to a primer's Tm (melting temperature) at a given salt concentration (e.g. 50 nM primer in 200 mM NaCl buffer at 55° C. for a 20-mer oligonucleotide with 50% GC content). (Stringent conditions for a given system can be determined by a skilled person. They will depend on the base composition, GC content, the length of the primer used and the salt concentration. For a 20 base oligonucleotide of 50% GC, calculated average annealing temperature is 55-60° C., but in practice may vary between 35 and 70° C.)
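As a rough illustration of how such a temperature might be estimated, the sketch below applies the Wallace rule of thumb (2° C. per A or T, 4° C. per G or C). This rule is not taken from the present description; it is a commonly used approximation for short oligonucleotides that ignores salt concentration, and the example primer sequence is hypothetical.

```python
def wallace_tm(primer):
    """Approximate melting temperature (deg C) of a short oligonucleotide
    using the Wallace rule: Tm = 2*(A+T) + 4*(G+C).
    Rule of thumb only; salt and primer concentration are ignored."""
    p = primer.upper()
    at = p.count("A") + p.count("T")
    gc = p.count("G") + p.count("C")
    return 2 * at + 4 * gc


# Hypothetical 20-mer with 50% GC content.
example_primer = "ATGC" * 5
example_tm = wallace_tm(example_primer)
```

For this 20-mer the estimate is 60° C., at the upper end of the 55-60° C. range quoted above; in practice the stringent temperature would still need to be determined for the system in question.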

Primers used for primer extension need not be provided in solution, since they can be provided in immobilised form. In this embodiment the primers should be provided in the vicinity of the immobilised molecules to which they are to be annealed. (Such primers may indeed already be present as excess immobilised primers that were not used in amplifying nucleic acid molecules during the formation of colonies.)

The nucleic acid molecules present in colonies to be sequenced will include a sequence that hybridises to the primers to be used in sequencing (preferably under “stringent” conditions). This portion can be added to a given molecule prior to amplification (which molecule may have a totally/partially unknown sequence) using techniques known to those skilled in the art. For example it can be synthesised artificially and can be added to a given molecule using a ligase.

Once a nucleic acid molecule annealed to a primer is provided, primer extension can be performed. RNA or DNA polymerases can be used. DNA polymerases are however the enzymes of choice for preferred embodiments. Several of these are commercially available. Polymerases which lack 3′ to 5′ exonuclease activity can be used, such as T7 DNA polymerase or the Klenow fragment of DNA polymerase I [e.g. the modified T7 DNA polymerase Sequenase™ 2.0 (Amersham) or Klenow fragment (3′ to 5′ exo-, New England Biolabs)]. However it is not essential to use such polymerases. Indeed, where it is desired that the polymerases have proof-reading activity, polymerases lacking 3′ to 5′ exonuclease activity would not be used. Certain applications may require the use of thermostable polymerases such as ThermoSequenase™ (Amersham) or Taquenase™ (ScienTech, St Louis, Mo.). Any nucleotides may be used for primer extension reactions (whether naturally occurring or non-naturally occurring). Preferred nucleotides are deoxyribonucleotides dATP, dTTP, dGTP and dCTP (although for some applications the dTTP analogue dUTP is preferred) or ribonucleotides ATP, UTP, GTP and CTP, at least some of which are provided in labelled form.

A washing step is preferably incorporated after each primer extension step in order to remove unincorporated nucleotides that may interfere with subsequent steps. The preferred washing solution should be compatible with polymerase activity and have a salt concentration that does not interfere with the annealing of primer molecules to the nucleic acid molecules to be sequenced. (In less preferred embodiments, the washing solution may interfere with polymerase activity. Here the washing solution would need to be removed before further primer extension.)

Considering that many copies of molecules to be sequenced can be provided in a given colony, a combination of labelled and non-labelled nucleotides can be used. In this case, even if a small proportion of the nucleotides are labelled (e.g. fluorescence labelled), the number of labels incorporated in each colony during primer extension can be sufficient to be detected by a detection device. For example the ratio of labelled to non-labelled nucleotides may be chosen so that, on average, labelled nucleotides are used in primer extension less than 50%, less than 20%, less than 10% or even less than 1% of the time (i.e. on average in a given primer extension step a nucleotide is incorporated in labelled form in less than 50%, less than 20%, less than 10% or less than 1% of the extended primers.)

Thus in a further embodiment of the present invention there is provided a method for sequencing nucleic acid molecules present in a colony of the present invention, the method comprising the steps of:

-   a) providing at least one colony comprising a plurality of single stranded nucleic acid molecules that have the same sequences as one another and that are hybridised to primers in a manner to allow primer extension in the presence of nucleotides and a nucleic acid polymerase;
-   b) providing said at least one colony with a nucleic acid polymerase and a given nucleotide in labelled and unlabelled form under conditions that allow extension of the primers if a complementary base or if a plurality of such bases is present at the appropriate position in the single stranded nucleic acid molecules present in said at least one colony;
-   c) detecting whether or not said labelled nucleotide has been used for primer extension by determining whether or not the label present on said nucleotide has been incorporated into extended primers.

Steps b) and c) may be repeated one or more times. Preferably a plurality of different colonies are provided and several different sequences are determined simultaneously.

This further embodiment of the present invention can be used to reduce costs, since relatively few labelled nucleotides are needed. It can also be used to reduce quenching effects.
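The reasoning behind mixing labelled and unlabelled nucleotides can be illustrated with simple arithmetic. The sketch below is hypothetical: the number of template copies per colony (1000) is an assumed figure chosen for illustration, and each primer is assumed to incorporate a labelled nucleotide independently with probability equal to the labelled fraction.

```python
def expected_labels(copies_per_colony, labelled_fraction):
    """Mean number of labels incorporated in one colony in one
    primer extension step (one base added per extended primer)."""
    return copies_per_colony * labelled_fraction


def prob_at_least_one_label(copies_per_colony, labelled_fraction):
    """Probability that a colony carries at least one label after the
    step, assuming independent incorporation events."""
    return 1.0 - (1.0 - labelled_fraction) ** copies_per_colony


# Assumed values: 1000 copies per colony, 1% labelled nucleotides.
mean_labels = expected_labels(1000, 0.01)
p_labelled = prob_at_least_one_label(1000, 0.01)
```

Under these assumptions a colony still carries about ten labels on average per step even though only 1% of the nucleotides are labelled. Whether that is detectable depends on the detection device used.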

It is however also possible to use only labelled nucleotides for primer extension or to use a major portion thereof (e.g. over 50%, over 70% or over 90% of the nucleotides used may be labelled). This can be done for example if labels are selected so as to prevent or reduce quenching effects. Alternatively labels may be removed or neutralised at various stages should quenching effects become problematic (e.g. laser bleaching of fluorophores may be performed). However this can increase the number of steps required and it is therefore preferred that labels are not removed (or at least that they are not removed after each nucleotide has been incorporated but are only removed periodically). In other less preferred embodiments, the primer itself and its extension product may be removed and replaced with another primer. If required, several steps of sequential label-free nucleotide additions may be performed before actual sequencing in the presence of labelled nucleotides is resumed. A further alternative is to use a different type of label from that used initially (e.g. by switching from fluorescein to rhodamine) should quenching effects become problematic.

In preferred embodiments of the present invention a plurality of labelled bases are incorporated into an extended primer during sequencing. This is advantageous in that it can speed up the sequencing procedure relative to methods in which, once a labelled base has been incorporated into an extended primer, the label must be removed before a further labelled base can be incorporated. (The plurality of labelled bases may be in the form of one or more contiguous stretches, although this is not essential.)

The present invention therefore also includes within its scope a method for sequencing nucleic acid molecules, comprising the steps of:

-   a) using a first colony to provide a plurality of single stranded nucleic acid molecules that have the same sequences as one another and that are hybridised to primers in a manner to allow primer extension in the presence of nucleotides and a nucleic acid polymerase;
-   b) using a second colony to provide a plurality of single stranded nucleic acid molecules that have the same sequences as one another, and that are also hybridised to primers in a manner to allow primer extension in the presence of nucleotides and a nucleic acid polymerase;
-   c) providing each colony with a nucleic acid polymerase and a given labelled nucleotide under conditions that allow extension of the primers if a complementary base or if a plurality of such bases is present at the appropriate position in the single stranded nucleic acid molecules;
-   d) detecting whether or not said labelled nucleotide has been used for primer extension at each colony by determining whether or not the label present on said nucleotide has been incorporated into extended primers;
-   e) repeating steps c) and d) one or more times so that extended primers comprising a plurality of labels are provided.

Preferably the sequences of the nucleic acid molecules present at said first and said second colonies are different from one another, i.e. a plurality of colonies comprising different nucleic acid molecules are sequenced.

In view of the foregoing description it will be appreciated that a large number of different sequencing methods using colonies of the present invention can be used. Various detection systems can be used to detect labels used in sequencing in these methods (although in certain embodiments detection may be possible simply by eye, so that no detection system is needed). A preferred detection system for fluorescent labels is a Charge-Coupled-Device (CCD) camera, which can optionally be coupled to a magnifying device. Any other device allowing detection and, preferably, also quantification of fluorescence on a surface may be used. Devices such as fluorescent imagers or confocal microscopes may be chosen.

In less preferred embodiments, the labels may be radioactive and a radioactivity detection device would then be required. Ideally such devices would be real-time radioactivity imaging systems. Also less preferred are other devices relying on phosphor screens (Molecular Dynamics) or autoradiography films for detection.

Depending on the number of colonies to be monitored, a scanning system may be preferred for data collection. (Although an alternative is to provide a plurality of detectors to enable all colonies to be covered.) Such a system allows a detector to move relative to a plurality of colonies to be analysed. This is useful when all the colonies providing signals are not within the field of view of a detector. The detector may be maintained in a fixed position and colonies to be analysed may be moved into the field of view of the detector (e.g. by means of a movable platform). Alternatively the colonies may be maintained in fixed position and the detection device may be moved to bring them into its field of view.

The detection system is preferably used in combination with an analysis system in order to determine the number (and preferably also the nature) of bases incorporated by primer extension at each colony after each step. This analysis may be performed immediately after each step or later on, using recorded data. The sequence of nucleic acid molecules present within a given colony can then be deduced from the number and type of nucleotides added after each step.
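A minimal sketch of such an analysis step is given below. It assumes, purely for illustration, that the signal recorded at a colony after each step is roughly proportional to the number of bases incorporated, and that the signal corresponding to one base (`unit_signal`) is known from a calibration; neither the function names nor the example signal values come from the present description.

```python
def bases_incorporated(signal, unit_signal):
    """Estimate how many bases were added in one step by rounding the
    recorded signal to the nearest multiple of the one-base signal."""
    return max(0, round(signal / unit_signal))


def deduce_sequence(steps, unit_signal):
    """steps is a list of (nucleotide_offered, recorded_signal) pairs for
    one colony, in the order the steps were performed. The sequence is
    rebuilt from the type and number of nucleotides added at each step."""
    return "".join(
        nt * bases_incorporated(signal, unit_signal) for nt, signal in steps
    )


# Hypothetical recorded data for one colony (arbitrary signal units).
recorded = [("A", 0.05), ("T", 0.0), ("G", 2.1), ("C", 0.1), ("A", 0.95), ("T", 1.0)]
sequence = deduce_sequence(recorded, unit_signal=1.0)
```

Because the function only needs the recorded (nucleotide, signal) pairs, the same analysis could be run immediately after each step or later on from stored data.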

Preferably the detection system is part of an apparatus comprising other components. The present invention includes an apparatus comprising a plurality of labelled nucleotides, a nucleic acid polymerase and detection means for detecting labelled nucleotides when incorporated into a nucleic acid molecule by primer extension, the detection means being adapted to distinguish between signals provided by labelled nucleotides incorporated at different colonies.

The apparatus may also include temperature control, solvent delivery and washing means. It may be automated.

Methods or apparatuses within the scope of the present invention can be used in the sequencing of:

-   unidentified nucleic acid molecules (i.e. de novo sequencing); and
-   nucleic acid molecules which are to be sequenced to check if one or more differences relative to a known sequence are present (e.g. identification of polymorphisms). This is sometimes referred to as “re-sequencing”.

Both de novo sequencing and re-sequencing are discussed in greater detail later on (see the following sections (v) and (vi)).

For de novo sequencing applications, the order of nucleotides applied to a given location can be chosen as desired. For example one may choose the sequential addition of nucleotides dATP, dTTP, dGTP, dCTP; dATP, dTTP, dGTP, dCTP; and so on. (Generally a single order of four nucleotides would be repeated, although this is not essential.) For re-sequencing applications, the order of nucleotides to be added at each step is preferably chosen according to a known sequence.

Re-sequencing may be of particular interest for the analysis of a large number of similar template molecules in order to detect and identify sequence differences (e.g. for the analysis of recombinant plasmids in candidate clones after site directed mutagenesis or, more importantly, for polymorphism screening in a population). Differences from a given sequence can be detected by the lack of incorporation of one or more nucleotides present in the given sequence at particular stages of primer extension. In contrast to most commonly used techniques, the present method allows for detection of any type of mutation such as point mutations, insertions or deletions. Furthermore, not only known existing mutations, but also previously unidentified mutations can be characterised by the provision of sequence information.

In some embodiments of the present invention long nucleic acid molecules may have to be sequenced by several sequencing reactions, each one allowing for determination of part of the complete sequence. These reactions may be carried out at different colonies (where the different colonies are each provided with the same nucleic acid molecules to be sequenced but different primers), or in successive cycles applied at the same colony (where between cycles the primers and extension products are washed off and replaced by different primers).

(iv) DNA Fingerprinting

This embodiment of the present invention aims to solve the problem of screening a large population for the identification of given features of given genes, such as the detection of single nucleotide polymorphisms.

In one preferred embodiment, it consists in generating tagged genomic DNA (see section G(ii) supra). (Thus each sample originating from a given individual has been labelled with a unique tag). This tagged DNA can be used for generating primary colonies on an appropriate surface comprising immobilised primers. Several successive probe hybridisation assays to the colonies can then be performed. Between each assay the preceding probe can be removed, e.g. by thermal denaturation and washing. Advantages of this embodiment of the present invention over other approaches for solving this problem are illustrated in the following example of a potential practical application.

It is intended to detect which part of a gene (of, e.g., 2000 bases in size), if any, is related to a disease phenotype in a population of typically 1,000 to 10,000 individuals. For each individual, a PCR amplification can be performed to specifically amplify the gene of interest and to link a tag and a colony generating primer (refer to section G(iv), preparation of “tagged DNA”).

In order to obtain a representative array of samples, one might want to array randomly 500,000 colonies (i.e. 10 times redundancy, so as to have only a small probability of missing the detection of a sample). With a colony density of 10,000 colonies per mm², a surface of ˜7 mm×7 mm can be used. This is a much smaller surface than that required by any other technology available at present time (e.g. the HySeq approach uses 220 mm×220 mm for the same number of samples (50,000) without redundancy). The amount of reactants (a great part of the cost) will be proportional to the surface occupied by the array of samples. Thus the present invention can provide an 800 fold improvement over the presently available technology.

Using an apparatus to monitor the result of the ‘in situ’ sequencing or probe hybridisation assays, it should take of the order of 1 to 10 seconds to image a fluorescent signal from colonies assayed using fluorescence present on a surface of ˜1 mm². Thus, assuming that the bottleneck of the method is the time required to image the result of the assay, it takes of the order of 10 minutes to image the result of an assay on 50,000 samples (500,000 colonies). To provide 200 assays including imaging (on one or several 7 mm×7 mm surfaces), using the present invention can take less than 36 hours. This represents a 20 times improvement compared to the best method known at present time (HySeq claims 30 days to achieve a comparable task).
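The arithmetic behind these estimates can be checked with a short calculation. The sketch below uses only figures quoted in this example (500,000 colonies, 10,000 colonies per mm², up to 10 seconds of imaging per mm², 200 assays); the function names are illustrative, and imaging is assumed to be the only time-limiting step, as stated above.

```python
import math


def array_area_mm2(n_colonies, density_per_mm2):
    """Surface needed to hold the colonies at the given density."""
    return n_colonies / density_per_mm2


def total_imaging_hours(area_mm2, seconds_per_mm2, n_assays):
    """Total imaging time if every assay images the whole surface once."""
    return area_mm2 * seconds_per_mm2 * n_assays / 3600.0


area = array_area_mm2(500_000, 10_000)      # surface in mm^2
side = math.sqrt(area)                      # side of a square array in mm
minutes_per_assay = area * 10 / 60.0        # at 10 s per mm^2
hours = total_imaging_hours(area, 10, 200)  # 200 assays
```

With these inputs the array occupies 50 mm² (a square of roughly 7 mm side), one assay takes a little over 8 minutes to image, and 200 assays take under 36 hours of imaging time.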

Improvements (colony densities 10 times higher and imaging time of 1 second) could allow for much higher throughput, and the ultimately expected throughput could be about 2000 times faster than the best, not yet fully demonstrated, technology available at present time.

Another advantage of using the present invention lies in the fact that it overcomes the problem arising with individuals who have heterozygous mutations for a given gene. While this problem may be addressed by existing sequencing methods to determine allelic polymorphisms, current high throughput mutation detection methods based on oligonucleotide probe hybridisation may lead to difficulties in the interpretation of results due to an unequal hybridisation of probes in cases of allelic polymorphisms and therefore errors can occur. In this embodiment of the present invention, each colony arises from a single copy of an amplified gene of interest. If an average of 10 colonies are generated for each individual locus, there will be an average of 5 colonies corresponding to one version of a gene and 5 colonies corresponding to the other version of the gene. Thus heterozygotic mutations can be scored by the number of times a single allele is detected per individual genome sample.
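The counting argument can be made concrete with a small sketch. It assumes, for illustration only, that each colony of a heterozygous individual derives from either allele with equal probability and independently of the other colonies; the threshold `min_colonies` and the allele names are hypothetical.

```python
def prob_both_alleles_seen(n_colonies):
    """Probability that n colonies from a heterozygous sample include at
    least one colony of each allele (equal, independent sampling)."""
    return 1.0 - 2.0 * 0.5 ** n_colonies


def call_genotype(allele_counts, min_colonies=2):
    """Score a sample from the number of colonies in which each allele
    was detected. allele_counts maps allele name -> colony count."""
    seen = sorted(a for a, c in allele_counts.items() if c >= min_colonies)
    if not seen:
        return "no call"
    if len(seen) == 1:
        return "homozygous " + seen[0]
    return "heterozygous " + "/".join(seen)


p10 = prob_both_alleles_seen(10)
example_call = call_genotype({"G": 6, "T": 4})
```

Under these assumptions, with 10 colonies per individual the chance of sampling only one of the two alleles is about 0.2%.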

(v) DNA Resequencing

This embodiment of the present invention provides a solution to the problem of identifying and characterising novel allelic polymorphisms within known genes in a large population of biological samples.

In its preferred embodiment it consists in obtaining tagged DNA (each sample originating from a given individual has been tagged with a unique tag; see section G(iv)). This encoded DNA can then be used for generating primary colonies on an appropriate surface comprising immobilised primers. Several successive assays of probe hybridisation to the colonies can then be performed wherein between each cyclic assay the preceding probe can be removed by thermal denaturation and washing. Preferably, the DNA sequence 3′ to a specific probe may be determined directly by ‘in situ sequencing’ (section I(iii), Methods of sequencing).

The advantages of the present invention over other approaches for solving this problem are illustrated in the following example of potential practical application:

It is desired to identify the variability of the sequence of a gene (of, e.g., 2 000 bases in size), if any, in a population of typically 4 000 individuals. It is assumed that a reference sequence of the gene is known. For each individual, a PCR amplification can be performed to specifically amplify the gene of interest and link a tag and a colony generating primer. In order to obtain a representative array of samples, one might want to array randomly 40 000 colonies (i.e. 10 times redundancy, so as to have a small probability of missing the detection of a sample). With a colony density of 10 000 colonies per mm², a surface of ˜2 mm×2 mm can be used.

Using an apparatus with a CCD camera (having a 2000×2000 pixel chip) to monitor the result of the assay, it should take of the order of 10 seconds to image a fluorescent signal from colonies on a surface of 4 mm². If it is assumed possible to read at least 20 bases during one round of the assay, this requires 61 imaging steps (3n+1 imaging steps are necessary for reading n bases). If it is assumed that the bottleneck of the method is the time to image the result of the assay, it takes of the order of 15 minutes to image the result of an assay on 4 000 samples (40 000 colonies). To realise 100 assays (on one or several 2×2 mm² surfaces) in order to cover the entire gene of interest, the present invention can allow the whole screening experiment to be performed in approximately one day, with one apparatus. This can be compared to the most powerful systems operational at the present time.

In this embodiment of the present invention with conservative assumptions (colony density, imaging time, size of the CCD chip), a throughput of 3.2×10⁶ bases per hour could be reached, i.e. a 400 fold improvement when compared to the most commonly used system at present time (current DNA sequencers have a typical throughput of the order of 8,000 bases read/hour).
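The figures in this example can be reproduced with the following short calculation. All inputs are taken from the text of this example (40 000 colonies, 20 bases per assay, 15 minutes per assay, 3n+1 imaging steps, 8,000 bases per hour for current sequencers); the function names are illustrative only.

```python
def imaging_steps(n_bases):
    """Imaging steps needed to read n bases, using the 3n+1 rule."""
    return 3 * n_bases + 1


def throughput_bases_per_hour(n_colonies, bases_per_assay, minutes_per_assay):
    """Bases read per hour when every colony yields bases_per_assay bases
    in one assay and imaging is the only time-limiting step."""
    return n_colonies * bases_per_assay / (minutes_per_assay / 60.0)


steps = imaging_steps(20)
throughput = throughput_bases_per_hour(40_000, 20, 15)
fold_improvement = throughput / 8_000
hours_for_100_assays = 100 * 15 / 60.0
```

The calculation gives 61 imaging steps, 3.2×10⁶ bases per hour and a 400 fold ratio, with 100 assays taking 25 hours, in line with the "approximately one day" stated above.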

(vi) De Novo DNA Sequencing

This embodiment of the present invention aims to solve the problem of sequencing novel genomes (or parts thereof) with low cost and in short time, where the sequence of the DNA is not known. Genomic DNA can be prepared, either directly from the total DNA of an organism of interest or from a vector into which DNA has been inserted. The prepared genomic DNA (from whatever source) can be used to generate DNA colonies. The DNA colonies can then be digested with a rare-cutting restriction enzyme, whose site is included in the linker, denatured and sequenced.

FIG. 18 depicts an example of de novo DNA sequencing. In this example, genomic DNA is fragmented into pieces of 100 to 2000 base pairs (see preparation of random DNA fragments, section G(i)). These fragments will be ligated to oligonucleotide linkers (IIa and IIb, FIG. 18, (a)) which include sequences specific for the grafted primers on the surface (‘P1’ and ‘P2’), a sequence which is recognised by a rare-cutting restriction nuclease (‘RE’) and a sequence corresponding to a sequencing primer (‘SP’), resulting in templates (III, FIG. 18, (b)). Using this prepared DNA as template for DNA colony formation, one obtains primary colonies (IV, FIG. 18, (c)). These colonies are then digested with the corresponding restriction endonuclease and denatured to remove the non-attached DNA strand (V, FIG. 18, (d)). The sequencing primer (SP) is then annealed to the attached single-stranded template (FIG. 18, (e)). Incorporation and detection of labeled nucleotides can then be carried out as previously described (see section I(iii), Methods of Sequencing).

In this embodiment, the throughput obtainable can be at least 400 times higher than presently available methods.

(vii) mRNA Gene Expression Monitoring

This embodiment of our invention aims to solve the problem of monitoring the expression of a large number of genes simultaneously.

Its preferred embodiment is depicted in FIG. 19.

Firstly, primary colonies are prepared, as depicted in FIG. 3. In its preferred form, the DNA used for this preparation is ‘prepared genomic DNA’ or ‘tagged genomic DNA’, as described in section G(i) and G(iii), respectively, and where the DNA is either from the whole genome of one (or several) organism(s) or from a subset thereof (e.g., from a library of previously isolated genes). In FIG. 19, the uppercase letters, “A”, “B” and “D” represent colonies which have arisen from genes which exhibit high, medium and low expression levels, respectively, and “E” represents colonies arising from non-expressed genes (in real cases, all these situations may not necessarily be present simultaneously).

Secondly, the colonies are treated to turn them into supports (i.e. secondary primers) for secondary colony growing (step i in FIG. 19, (a)), as described in section E. At this stage (FIG. 19, (a)), the treated colonies are represented by underlined characters (A, B, D, or E).

Thirdly, (step ii in FIG. 19, (b)) this support for secondary colony growing is used to regenerate colonies from mRNA (or cDNA) templates extracted from a biological sample, as described in section C. If the template is mRNA, the priming step of colony regeneration will be performed with a reverse transcriptase. After a given number of colony amplification cycles, preferably 1 to 50, the situation will be as depicted in (FIG. 19, (c)): the colonies corresponding to highly expressed genes (represented by the letter “A”) are totally regenerated, as their regeneration has been initiated by many copies of the mRNA; the colonies corresponding to genes of medium expression levels (represented by the letters “b” and “B”), have been only partially regenerated; only a few of the colonies corresponding to rare genes (represented by the letter “d”), have been partially regenerated; the colonies corresponding to non-expressed sequences (represented by the letter “E”), have not been regenerated at all.

Lastly, (step iii in FIG. 19, (c)), additional cycles of colony growing are performed (preferably 2 to 50), and the colonies which have not been totally regenerated during the previous steps finally become totally regenerated, “b” becomes “B”, “d” becomes “D” (FIG. 19, (d)): the colonies corresponding to genes with high and medium expression levels are all regenerated “A” and “B” or “B”; the colonies corresponding to genes with low levels of expression are not all regenerated “D” and “D”; the colonies corresponding to non-expressed sequences are not regenerated at all “E”.

The relative levels of expression of the genes can be obtained by the following preferred methods:

Firstly, levels of expression can be monitored by following the rate of regeneration of the colonies (i.e., by measuring the amount of DNA inside a colony after different numbers of colony growing cycles during step (iii)), as the rate at which a colony is regenerated will be linked to the number of mRNA (or cDNA) molecules which initiated the regeneration of that colony (at first approximation, the number of DNA molecules after n cycles, noted M(n), in a colony undergoing regeneration should be given by M(n)=M₀r^(n-1), where M₀ is the number of molecules which initiated the regeneration of the colony, r is the growing rate and n is the number of cycles);

Secondly, levels of expression can be monitored by counting, for each gene, the number of colonies which have been regenerated and comparing this number to the total number of colonies corresponding to that gene. These measurements will generally give access to the relative expression levels of the genes represented by the colonies. The identification of the colonies is preferably performed by fingerprinting, in a manner essentially similar to the embodiment of section I(iv). Note that encoding the DNA samples is not required, but can be considered as an alternative to the direct identification of the DNA in the colonies. This can be of practical interest because with coding, the same codes (thus the same oligonucleotides involved in assaying the code) can be used for any set of genes, whereas without code, a different set of specific oligonucleotides has to be used for each set of genes.
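Both ways of estimating expression can be sketched numerically. The code below takes the first-approximation growth model of the first method, read here as M(n) = M₀·r^(n−1), and the colony-counting ratio of the second method. The function names and the example numbers are hypothetical, and the growing rate r is assumed to be known and constant over the cycles considered.

```python
def molecules_after(m0, r, n):
    """Number of DNA molecules in a colony after n cycles, for m0
    initiating molecules and growing rate r: M(n) = m0 * r**(n - 1)."""
    return m0 * r ** (n - 1)


def initiating_molecules(m_n, r, n):
    """Invert the model: estimate m0 from the amount measured at cycle n."""
    return m_n / r ** (n - 1)


def regenerated_fraction(n_regenerated, n_total):
    """Second method: share of a gene's colonies that were regenerated."""
    return n_regenerated / n_total


# Hypothetical measurements at cycle 4 with an assumed rate r = 2.
m0_gene_a = initiating_molecules(800.0, 2.0, 4)
m0_gene_d = initiating_molecules(8.0, 2.0, 4)
relative_expression = m0_gene_a / m0_gene_d
```

When two colonies are measured at the same cycle and share the same r, the ratio of their estimated M₀ values does not depend on r, which is why relative rather than absolute expression levels are the natural output.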

This embodiment of our invention has many advantages compared to the current state of the art, including: a very high throughput; no requirement for prior amplification of the mRNA (even though prior amplification is compatible with our invention); small amounts of samples and reactants are required due to the high density of samples with our invention; the presence of highly expressed genes has no effect on the ability to monitor genes with low levels of expression; the ability to simultaneously monitor low and high levels of expression within the set of genes of interest.

When the initial DNA in the generation of the primary DNA colony is made from the DNA of a whole genome, this embodiment also provides the following features: there is no interference between genes expressed at high level and at low level even though one has not performed specific amplification of the genes of interest. This is a unique feature of the use of this invention: specific amplification is not possible because the initial assumption of this embodiment is to monitor the expression of genes which may not yet have been isolated, which are thus unknown, and for which no specific (unique) sequences are known, such sequences being necessary for specific gene amplification. The ability of our invention to perform this type of mRNA expression monitoring is due to the fact that when the primary colonies are prepared, statistically, each piece of the initial genome will be represented by the same number of colonies. Thus, frequent and rare DNA will initiate the same number of colonies (e.g., one colony per added genome molecule). Quantitative information might be obtained both from frequent and rare mRNAs by monitoring the growing rate of the colonies.

(viii) Isolation and Characterisation of Novel Expressed Genes

This embodiment of our invention aims to solve the problem of isolating the genes which are specifically induced under given conditions, e.g., in specific tissues, different strains of a given species or under specific activation. A practical example is the identification of genes which are up or down regulated after drug administration.

The preferred embodiment for isolating genes from a specific or activated biological sample (hereafter called target sample) which are up-regulated compared with a reference biological sample (hereafter called reference sample) is depicted in FIG. 20.

Firstly, primary colonies are prepared (FIG. 20, (a)). In its preferred form, the DNA used for this preparation is prepared genomic DNA or tagged genomic DNA, as described in sections G(i) and G(ii), respectively, where the DNA is either from the whole genome of one (or several) organism(s) or from a subset thereof (e.g., from a library of previously isolated genes), and where both the primers used for colony generation (hereafter called P1 and P2) contain an endonuclease restriction site. In FIG. 20, (a), “A” represents colonies which have arisen from genes expressed in both the reference sample and the target sample, “B” represents colonies which have arisen from genes expressed only in the reference sample, “C” represents colonies which have arisen from genes expressed only in the target sample, and “D” represents colonies arising from non-expressed genes (in real cases, all these situations may not necessarily be present simultaneously).

Secondly, the primary colonies are treated to generate secondary primers as the support for secondary colony growth (step i in FIG. 20, (a)). At this stage (b), the colonies are represented as underlined characters (A, B, C, D).

Thirdly (step ii in FIG. 20, (b)), the secondary primers are used to regenerate colonies using mRNA or cDNA (represented by “mA+mB”) extracted from the biological reference sample as a template, as described in G(v). If the template is mRNA, the first elongation step of colony regeneration will be performed with a reverse transcriptase. After enough colony growing cycles, preferably 5 to 100, only the colonies corresponding to genes expressed in the reference sample (“A” and “B”) will be regenerated, as depicted in FIG. 20, (c).

In step (iii), the colonies are digested with a restriction enzyme (represented by RE) which recognises a site in the flanking primer sequences, P1 and P2, which are grafted on the support and which were the basis of primary colony generation. Importantly, only the colonies which have been regenerated during step (ii) will be digested. This is because the support for secondary colony growth is made of single stranded DNA molecules, which cannot be digested by the restriction enzyme. Only the regenerated colonies are present in a double stranded form, and are digested. After digestion, the situation is the one depicted in FIG. 20, (d). The colonies corresponding to the genes expressed in the reference sample have totally disappeared, i.e., they are not even present as a support for secondary colony growth, whereas the colonies corresponding to genes expressed only in the target sample (“C”) and the colonies corresponding to non-expressed genes (“D”) are still present as a support for secondary colony generation.

In step (iv), mRNA (or cDNA) (represented by “mA+mC”) extracted from the target sample is used to generate secondary colonies. Because the colonies corresponding to mA and mB no longer exist, only the colonies corresponding to mC can be regenerated (i.e., only the mRNA specifically expressed in the target sample). After a sufficient number of colony growing cycles (preferably 5 to 100), the situation is such that only the colonies corresponding to genes expressed specifically in the target sample are regenerated (“C”, FIG. 20, (e)).

In step (v), the regenerated colonies “C” are used to generate copies of the DNA that they contain by performing several (preferably 1 to 20) colony growing cycles in the presence of the primers P1 and P2, as described in section D of the present invention. A PCR amplification is then performed using P1 and P2 in solution (described in section D) and the amplified DNA is characterised by classical methods.

The preferred embodiment for isolating genes from a specific or activated biological sample which are less expressed than in a reference biological sample is depicted in FIG. 21. The different steps involved in this procedure are very similar to those involved in the isolation of genes which are more highly expressed than in the reference sample, and the notation is the same as in FIG. 20. The only difference is that the order used to regenerate the colonies is inverted: in step (ii), the mRNA used is the one extracted from the target biological sample (“mA+mC”) instead of the mRNA extracted from the reference biological sample (“mA+mB”), and in step (iv), the mRNA used is the one extracted from the reference biological sample (“mA+mB”) instead of the one extracted from the target sample (“mA+mC”). As a result, only the DNA from colonies corresponding to genes which are expressed in the reference sample but not in the target sample is recovered and amplified (“B”, FIG. 19 f).
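The selection logic of FIG. 20 and FIG. 21 can be summarised as set operations on the colony classes A, B, C and D. The following is only an illustrative sketch, not part of the described method: the function name `subtractive_selection` and the use of Python sets are assumptions made for this example, and each colony class is treated as a single idealised element that is either fully regenerated or not.

```python
def subtractive_selection(primary, first_sample, second_sample):
    """Idealised model of steps (i) to (iv); names are hypothetical.

    primary       -- colony classes present on the support
    first_sample  -- genes expressed in the sample used in step (ii)
    second_sample -- genes expressed in the sample used in step (iv)
    """
    # Step (i): every primary colony becomes single stranded support.
    support = set(primary)
    # Step (ii): only colonies matching the first sample are regenerated
    # (and therefore become double stranded).
    regenerated = support & set(first_sample)
    # Step (iii): the restriction enzyme removes only double stranded
    # colonies; single stranded support is left intact.
    support -= regenerated
    # Step (iv): the second sample regenerates colonies from what remains.
    return support & set(second_sample)


primary = {"A", "B", "C", "D"}
reference = {"A", "B"}   # mA + mB
target = {"A", "C"}      # mA + mC

# FIG. 20 order: reference first, then target.
up_regulated = subtractive_selection(primary, reference, target)
# FIG. 21 order: target first, then reference.
down_regulated = subtractive_selection(primary, target, reference)

print(sorted(up_regulated))    # ['C']
print(sorted(down_regulated))  # ['B']
```

Swapping the two sample arguments is the only change between the two procedures, which mirrors the statement that the steps differ only in the order used to regenerate the colonies.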

1. (canceled)
2. A multiplex method for determining the presence of one or more target polynucleotide sequences across a plurality of samples, comprising: amplifying and tagging target polynucleotides by PCR in each of said plurality of samples with an amplification primer comprising a high throughput sequencing adaptor, a sample-specific tag sequence, and a priming sequence to amplify said target polynucleotide sequence(s); combining the amplified polynucleotides and sequencing the polynucleotide pool in high throughput, so as to determine the sequence of at least 100 tagged polynucleotides for each of said samples; assigning the nucleotide sequences to the originating samples by the nucleotide sequence of the sample-specific tag, thereby determining the presence of the target polynucleotide sequence(s) across the samples.
3. The method of claim 2, wherein the samples are diagnostic.
4. The method of claim 2, wherein the samples are biological samples.
5. The method of claim 2, wherein the target polynucleotides are genomic DNA or mitochondrial DNA.
6. The method of claim 2, wherein the target polynucleotides are cDNA.
7. The method of claim 2, wherein the number of samples is over 10.
8. The method of claim 2, wherein the number of samples is over 100.
9. The method of claim 2, wherein the priming sequence is the same across samples.
10. The method of claim 2, wherein the sequencing adaptor immobilizes the individual polynucleotides for clonal amplification and sequencing.
11. The method of claim 10, wherein the sequencing is sequencing-by-synthesis, sequencing-by-ligation, or sequencing-by-hybridization.
12. The method of claim 2, wherein the PCR amplifications employ a forward and reverse primer that both comprise the high throughput sequencing adaptor and the sample-specific tag.
13. A method for simultaneously determining the sequence of one or more sequences of interest in a plurality of samples, comprising: amplifying and tagging polynucleotide sequences of interest by PCR in each of said plurality of samples with an amplification primer comprising a sequence complementary to a sequencing primer for high throughput sequencing, a tag sequence which is different for each sample, and primer sequences surrounding the specific sequences of interest to amplify said specific sequences of interest; sequencing the amplified polynucleotide sequences of interest from the plurality of samples in high throughput, so as to determine the sequence of over 100 different sequences from the plurality of samples; and identifying the origin of the amplified polynucleotide sequences of interest using the tag sequence which is different for each sample.
14. The method of claim 13, wherein the over 100 different sequences are from each of the plurality of samples.
15. The method of claim 13, wherein the method is for diagnosis.
16. The method of claim 13, wherein the samples are biological samples.
17. The method of claim 13, wherein the polynucleotide sequences of interest are genomic DNA or mitochondrial DNA.
18. The method of claim 13, wherein the polynucleotide sequences of interest are cDNA.
19. The method of claim 13, wherein the number of samples is over 10.
20. The method of claim 13, wherein the number of samples is over 100.
21. The method of claim 13, wherein the primer sequences surrounding the specific sequences of interest to amplify said specific sequences of interest is the same across samples.
22. The method of claim 13, wherein the sequence complementary to a sequencing primer for high throughput sequencing immobilizes the individual polynucleotides for clonal amplification and sequencing.
23. The method of claim 22, wherein the sequencing is sequencing-by-synthesis, sequencing-by-ligation, or sequencing-by-hybridization.
24. The method of claim 13, wherein the PCR amplifications employ a forward and reverse primer that both comprise the sequence complementary to a sequencing primer for high throughput sequencing and the tag sequence which is different for each sample.
25. The method of claim 13, wherein over 1000 different sequences from the plurality of samples are determined.
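
By way of illustration only, the step of assigning pooled sequences to their originating samples by a sample-specific tag could be sketched as follows. This is a hypothetical example and does not limit the claims: the function names, the tag sequences, the sample names and the reads are invented for the illustration, and it is assumed that the adaptor has already been removed, that the tag occupies the first bases of each read, and that tags are matched exactly.

```python
from collections import defaultdict


def demultiplex(reads, tag_to_sample, tag_len):
    """Group reads by sample using the first tag_len bases as the tag.

    Assumption for this sketch: exact tag matching; reads whose tag is
    not in tag_to_sample are discarded.
    """
    by_sample = defaultdict(list)
    for read in reads:
        tag, insert = read[:tag_len], read[tag_len:]
        sample = tag_to_sample.get(tag)
        if sample is not None:
            by_sample[sample].append(insert)
    return dict(by_sample)


def target_present(by_sample, target):
    """Report, per sample, whether any assigned read contains the target."""
    return {
        sample: any(target in seq for seq in seqs)
        for sample, seqs in by_sample.items()
    }


# Hypothetical tags and pooled reads (tag followed by amplified sequence).
tags = {"ACGT": "sample_1", "TGCA": "sample_2"}
pooled_reads = ["ACGTGGATCCA", "TGCATTTTTTT", "ACGTCCCCCCC", "NNNNGGATCCA"]

assigned = demultiplex(pooled_reads, tags, tag_len=4)
presence = target_present(assigned, "GGATCC")
print(assigned)
print(presence)
```

A practical implementation would normally tolerate sequencing errors in the tag; exact matching is used here only to keep the sketch short.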